Abstract: Dealing with pollution attacks in inter-session network coding is challenging because sources, in addition to intermediate nodes, can be malicious. In this work, we precisely define corrupted packets in inter-session pollution based on the commitment of the source packets. We then propose three detection schemes: one hash-based scheme and two MAC-based schemes, InterMac_CPK and SpaceMac_PM. InterMac_CPK is the first multi-source homomorphic MAC scheme that supports multiple keys. Both MAC schemes can replace traditional MACs, e.g., HMAC, in networks that employ inter-session coding. All three schemes provide in-network detection, are collusion-resistant, and have very low online bandwidth and computation overhead.

I. Introduction 

Network coding involves packets being combined at intermediate nodes inside the network. Depending on whether packets from the same or different sessions are mixed, network coding is classified as intra-session or inter-session, respectively. Inter-session coding, which is the focus of this paper, has been implemented in practice, such as in wireless mesh networks [1], [2] and streaming gestures [3].

The mixing nature of network coding makes it extremely 
vulnerable to pollution (a.k.a. Byzantine modification) attacks. 
In such an attack, malicious nodes inject corrupted packets 
that then are combined and forwarded by downstream nodes, 
eventually resulting in a large number of corrupted packets 
propagating in the network. This wastes network resources, 
such as bandwidth and CPU time. More critically, it prevents 
receivers from decoding the original packets. A large body of 
work has focused on pollution attacks in intra-session coding 
[4]-[25], while pollution attacks in inter-session coding have
received significantly less attention [27]-[29].

In this paper, our goal is to detect pollution attacks in inter- 
session network coding using cryptographic primitives. This is 
particularly challenging because not only intermediate nodes 
but also sources can be malicious and initiate attacks them- 
selves. Recently, Agrawal et al. [27] formulated the problem 
for the first time and presented a detection scheme based on 
homomorphic signatures. This scheme has high computation 
overhead due to many public-key signature verification and
modular exponentiation operations performed at each node per 
packet. Furthermore, the signature size is large and does not 
scale as it increases linearly in the number of sources and 
packets sent by them. 

In this paper, we introduce three novel detection schemes: one hash-based and two MAC-based schemes, all of which are significantly more efficient than [27]. The key ingredient of our approaches is the use of commitment (to a trusted controller) of source packets. This commitment allows us to precisely define corrupted packets, thereby enabling detection of all corrupted packets, including some that [27] cannot detect. We build upon this idea and design three schemes:

• A hash-based detection scheme that combines homomorphic [30] and traditional hash functions, e.g., SHA-1.

• InterMac_CPK, a multi-source homomorphic MAC scheme. It is the first homomorphic MAC scheme that allows tags to be generated under different keys.

• SpaceMac_PM, a combination of an existing inner-product homomorphic MAC scheme (built for intra-session coding [25]) and a private inner-product protocol [31].

Our hash-based scheme allows nodes to detect corrupted packets right after they receive them, thus providing in-network detection. Both of our MAC schemes can replace traditional MACs, e.g., HMAC, to provide end-to-end detection. Moreover, they can be used as building blocks for other schemes that provide in-network detection, such as [17], [18], [20], and [24]. The hash-based detection scheme is arbitrarily collusion-resistant. Meanwhile, depending on the in-network detection scheme used, a scheme built on one of the MAC schemes could be either arbitrarily collusion-resistant or c-collusion-resistant, for a predetermined small c. We also custom design commitment schemes that offer high bandwidth efficiency for both MAC schemes. Most importantly, all proposed schemes have significantly higher bandwidth and computation efficiency than the state-of-the-art detection scheme for inter-session coding [27]. In particular, simulation results show that for a detection scheme built on one of our MAC schemes, both the online bandwidth and computation overhead are low: as low as 3% and 4 ms, respectively.

The proposed schemes provide alternative approaches to detect corrupted packets in inter-session network coding. In general, the MAC-based schemes have significantly lower computation overhead than the hash-based scheme (Section VI-B). SpaceMac_PM offers lower commitment overhead (Section VI-A), but InterMac_CPK is less vulnerable to colluding malicious receivers (end of Section V-D).

The rest of this paper is organized as follows. In Section II we discuss related work. In Section III we describe the network operations, threat models, and definition of corrupted packets. In Section IV we present the proposed hash-based detection scheme. In Section V we describe InterMac_CPK and SpaceMac_PM. In Section VI we evaluate the performance of our schemes. Finally, we conclude in Section VII.

II. Related Work 

Because pollution attacks pose a severe threat to the success of network coding, a large body of research has been devoted to designing defense mechanisms, including both information-theoretic and cryptographic approaches. The existing approaches provide error-correction capability [4]-[7], attack detection [8]-[20], [23], [24], [27], and attacker identification [21], [22], [29]. Most of these approaches, including our prior work [25], are proposed for intra-session coding and are not applicable to inter-session coding, as discussed in Section III-C. We refer the reader to [23] for a comprehensive overview of intra-session defense mechanisms. Here, we focus on defense against pollution attacks in inter-session network coding.

Agrawal et al. [27] proposed a homomorphic signature scheme to provide in-network detection for inter-session network coding. In their scheme, the signature of a packet sent by a source $S$ consists of the $g$ hash values of all $g$ source packets sent by $S$, together with a public-key signature of the hash values. The hash values are computed using the homomorphic hash function proposed in [30]. The signature of the hash values is computed using a secure signature scheme. The signature $\sigma_y$ of a packet $y$, which is a linear combination of packets belonging to $\ell$ different flows, is the concatenation of $\ell$ different signatures. The main drawbacks of this scheme are (i) the expensive verification: verifying $\sigma_y$ involves $\ell$ public-key signature verifications and one homomorphic hash verification, and (ii) the large signature size: $\sigma_y$ includes $\ell$ public-key signatures and $g\ell$ hash values.

The approaches proposed in this paper are inherently different from [27]. We leverage the commitment of source packets and build our detection schemes on unkeyed and symmetric-key cryptographic primitives as opposed to public-key primitives. We significantly improve the bandwidth and computation efficiency over [27] (Section VI). Furthermore, by precisely defining corrupted packets, our schemes are able to detect some corrupted packets that [27] cannot (Section III-D).

Dong et al. [29] proposed a scheme that allows for identify- 
ing malicious nodes in inter-session network coding. When a 
pollution is detected, a bit-level traceback procedure is executed 
to identify the attacker. Our detection schemes are orthogonal 
and complementary to this identification scheme. 

III. Problem Formulation 

A. Network Model and Operation 

Some of the notation we use is from [23] and [27]. Consider a graph denoted by $\mathcal{G} = (\mathcal{V}, \mathcal{E})$. There are $s$ source-receiver pairs in the network, denoted by $(S_i, R_i)$, $i \in [1, s]$. Each source, $S_i$, sends packets to its corresponding receiver, $R_i$, by first dividing the packets into generations. For simplicity, we assume that all sources use the same generation size, $g$. It is straightforward to extend our defense schemes to accommodate different generation sizes. $S_i$ interprets its packets in a single generation, $\bar{v}_{i,j}$, $j \in [1, g]$, as vectors in an $n$-dimensional vector space over a finite field $\mathbb{F}_q$. Before sending, $S_i$ appends to $\bar{v}_{i,j}$ its coding coefficients, forming $g$ augmented packets, $v_{i,1}, \cdots, v_{i,g}$:

$$v_{i,j} = (\bar{v}_{i,j},\ 0, \cdots, 0, 1, 0, \cdots, 0) \in \mathbb{F}_q^{n+sg},$$

where the single 1 is at position $(i-1)g + j$ of the $sg$ appended symbols.

We refer to the augmented packets, $v_{i,j}$, as source packets and to $\bar{v}_{i,j}$ as the data of $v_{i,j}$. We use $\mathrm{aug}(v_{i,j})$ to denote the coding coefficients of $v_{i,j}$.

Note that for each generation, there are $m \stackrel{\text{def}}{=} sg$ source packets. The sources send source packets into the network generation by generation. Intermediate nodes in the network perform generation-based linear network coding, i.e., they linearly combine packets that belong to the same generation. Packets sent from different sources may be combined by intermediate nodes. For example, when an intermediate node $N$ receives $\ell$ packets, $w_1, \cdots, w_\ell$, which are linear combinations of the source packets sent by any set of sources, it chooses $\ell$ local coding coefficients, $\alpha_1, \cdots, \alpha_\ell$, depending on the coding scheme used, and then transmits $y = \sum_{i=1}^{\ell} \alpha_i w_i$ on one or more of its outgoing edges. Note that if $y$ is a linear combination of the source packets $v_{i,j}$, then the last $m$ symbols of $y$ contain its global coding coefficients. For clarity, we focus on the transmission of a single generation by all the sources.

Let the subspace spanned by the source packets be $\Pi \stackrel{\text{def}}{=} \mathrm{span}(v_{1,1}, \cdots, v_{s,g})$ and the subspace spanned by the data of the source packets be $\bar{\Pi} \stackrel{\text{def}}{=} \mathrm{span}(\bar{v}_{1,1}, \cdots, \bar{v}_{s,g})$. We refer to $\Pi$ as the source space and $\bar{\Pi}$ as the source data space. When all nodes in the network are benign, all packets in the network belong to the source space. A receiver, $R_i$, can decode the original packets sent by its corresponding source $S_i$ after collecting enough packets. In particular, after collecting $m$ linearly independent packets, $R_i$ can decode the original packets by applying Gaussian elimination on the $m \times (n + m)$ matrix formed by the collected packets. $R_i$ may also be able to decode using fewer than $m$ linearly independent packets because $R_i$ is not interested in packets sent by the other sources.
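
The network operation above can be illustrated with a minimal Python sketch. It assumes a small prime field (q = 257) and hypothetical helper names (`augment`, `combine`, `decode`) that do not come from the paper; it shows augmentation of source data, linear combination at an intermediate node, and decoding by Gauss-Jordan elimination on the coefficient part of the collected packets.

```python
Q = 257  # small prime field size, chosen only for illustration

def augment(data, i, j, s, g):
    """Append the m = s*g coding coefficients to packet j of source i (1-indexed)."""
    coeffs = [0] * (s * g)
    coeffs[(i - 1) * g + (j - 1)] = 1
    return list(data) + coeffs

def combine(packets, alphas):
    """Linear combination sum(alpha_i * w_i) over F_Q, symbol by symbol."""
    return [sum(a * w[k] for a, w in zip(alphas, packets)) % Q
            for k in range(len(packets[0]))]

def decode(packets, n):
    """Gauss-Jordan elimination on the last m symbols; returns the m data vectors or None."""
    m = len(packets[0]) - n
    rows = [list(p) for p in packets]
    for col in range(m):
        pivot = next((r for r in range(col, len(rows)) if rows[r][n + col] % Q), None)
        if pivot is None:
            return None  # not enough linearly independent packets
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][n + col], -1, Q)  # modular inverse (Python 3.8+)
        rows[col] = [x * inv % Q for x in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][n + col]:
                f = rows[r][n + col]
                rows[r] = [(x - f * y) % Q for x, y in zip(rows[r], rows[col])]
    return [row[:n] for row in rows[:m]]

# Two sources (s = 2), generation size g = 1, data dimension n = 2.
v1 = augment([5, 7], 1, 1, 2, 1)      # [5, 7, 1, 0]
v2 = augment([11, 13], 2, 1, 2, 1)    # [11, 13, 0, 1]
y1 = combine([v1, v2], [1, 1])
y2 = combine([v1, v2], [2, 3])
recovered = decode([y1, y2], 2)
```

With two linearly independent coded packets, `decode` returns the data of both source packets; with a single coded packet it returns `None`, matching the requirement of m linearly independent packets for full decoding.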

B. Inter-Session Network Coding Characteristics 

In inter-session network coding, it is often the case that 
intermediate nodes are able to decode source packets from 
the received coded packets. For instance, in COPE [1], every
encoded packet is decoded at the next hop. There are also
other coding schemes where encoded packets are decoded by
either the first hop or the second hop, e.g., see [2] and [3].
Furthermore, in inter-session coding, source packets of a source 
Si may not traverse the whole network but only some parts of 
the network: for instance, in a directed acyclic graph, packets 
sent from Si should not travel to nodes that have no path to 
Ri. We will exploit these observations later in the proposed 
schemes. 

Finally, the most important observation is that, in inter-session network coding, not only intermediate nodes but also some sources may be malicious. This differentiates the scenario we study in this work from single-source intra-session coding. We explicitly take this observation into account in our threat model below.

Fig. 1. An example of a pollution attack in inter-session network coding. Source $S_2$ is malicious and all other nodes are benign. $S_2$ pollutes the flow $S_1$-$R_1$ by injecting conflicting source packets $(v_2', 0, 1)$ and $(v_2, 0, 1)$. $R_1$ decodes and recovers an incorrect $v_1$.

C. Threat Model 

We assume that up to $s - 1$ sources could be malicious,
any intermediate node may be malicious, and the receivers 
are trusted. To pollute the network, the malicious nodes may 
generate and inject any type of traffic into the network; they 
may also collude among themselves. We assume the attackers 
know about the construction of any cryptographic primitive 
used but the attackers' running time is polynomial in the 
security parameter of cryptographic primitives. 

Example Attack. Fig. 1 depicts the classic butterfly network with coding across two unicast sessions. There are two sources: $S_1$ is benign, but $S_2$ is malicious. $A$, $B$, $R_1$, and $R_2$ are benign. The generation size is 1. Only node $A$ combines incoming packets, and only $R_1$ and $R_2$ decode. Local coding coefficients at $A$ are fixed: $\alpha_1 = \alpha_2 = 1$. Packets sent by the nodes are annotated on the edges. In this example, $S_2$ successfully pollutes the network because it causes an incorrect decoding at $R_1$. More specifically, by subtracting $(v_2, 0, 1)$ from $(v_1 + v_2', 1, 1)$, $R_1$ receives $v_1 + v_2' - v_2$ instead of $v_1$.
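
The example attack can be replayed numerically. The sketch below is illustrative only: it assumes one-symbol data over a small prime field (q = 257), arbitrary data values, and the variable name `v2_bad` standing in for the conflicting packet $v_2'$.

```python
Q = 257  # illustrative prime field size

def add(p, r):
    return tuple((a + b) % Q for a, b in zip(p, r))

def sub(p, r):
    return tuple((a - b) % Q for a, b in zip(p, r))

v1, v2, v2_bad = 42, 7, 99      # data symbols (n = 1); v2_bad plays the role of v2'

from_s1 = (v1, 1, 0)            # S1 -> A
to_a = (v2_bad, 0, 1)           # S2 -> A: the conflicting source packet
to_r1 = (v2, 0, 1)              # S2 -> R1: the other version of S2's packet

coded = add(from_s1, to_a)      # A combines with alpha_1 = alpha_2 = 1
decoded = sub(coded, to_r1)     # R1 removes what it believes is S2's contribution
```

Here `decoded` has coding coefficients (1, 0), so it looks like a clean copy of $S_1$'s packet, yet its data symbol is $v_1 + v_2' - v_2$ rather than $v_1$.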



Intra-Session Detection Failure. Both unkeyed and key-based cryptographic approaches developed for intra-session coding fail to detect corrupted packets in the inter-session threat model. The ways they fail, however, are different. We first consider applying the hash-based scheme proposed in [12]. Prior to the transmission, $A$, $B$, $R_1$, and $R_2$ download the hash of $v_1$ from $S_1$ and the hash of $v_2$ from $S_2$. $S_2$ can act maliciously by sending to $R_1$ the hash of $v_2$ but sending to $A$ and $B$ the hash of $v_2'$. This makes $A$ accept $(v_2', 0, 1)$, $B$ accept $(v_1 + v_2', 1, 1)$, and $R_1$ accept $(v_2, 0, 1)$. Therefore, $S_2$ can still carry out the same attack.

Now, let us consider applying any of the proposed MAC or signature-based approaches, such as [10], [15]-[18], [20], [23]. When using any one of these schemes, MAC tags or signatures of packets must be generated under the same (private or symmetric) secret key so that the homomorphic property of the scheme holds. But if this is the case, a malicious source knowing the key can generate a valid tag/signature of any packet of its interest and pollute the network. For instance, $S_2$ can send to $R_1$ the packet $(v_1', 1, 0)$ and its valid tag/signature, where $v_1 \neq v_1'$, and $R_1$ will accept this corrupted packet.



D. Corrupted Packet 

Loosely speaking, we consider any packet that causes a pollution of flows from benign sources corrupted. Nevertheless, in order to detect a pollution attack, corrupted packets must be precisely defined. We first require that each source, $S_i$, commits to its source packets before the transmission. We then define a corrupted packet based on this commitment. (i) In our hash-based scheme, we require each source to commit to the data of each of its packets by sending the hash of the data to a trusted controller. Let $\bar{\Pi}$ be the space spanned by the committed data of all the sources. We call $\bar{\Pi}$ the committed source data space. (ii) In our MAC-based schemes, we require each source to commit to each of its whole packets as opposed to just the data. We call the space spanned by all the committed source packets the committed source space and denote it by $\Pi$.

Definition 1. Let $\bar{\Pi}$ and $\Pi$ be the committed source data space and committed source space, respectively. A packet $y$ is considered corrupted if $\bar{y} \notin \bar{\Pi}$ or $y \notin \Pi$.

The above definition helps us to design detection schemes capable of detecting all corrupted packets. For instance, in Fig. 1, if $S_2$ commits to $v_2$ then our schemes will help node $A$ to drop $(v_2', 0, 1)$, thus avoiding having $(v_1 + v_2', 1, 1)$. In contrast, the scheme in [27] only helps a node to detect conflicting packets and does not detect all corrupted packets. For instance, if [27] is used, $A$ and $B$ still accept $v_2'$ and $(v_1 + v_2', 1, 1)$, respectively. $(v_1 + v_2', 1, 1)$ is detected as corrupted at $R_1$ if $R_1$ receives $v_2$ first, or $v_2$ is detected as corrupted if $R_1$ receives $(v_1 + v_2', 1, 1)$ first.

E. Trusted Controller 

Trusted controllers have been used explicitly in previous work that identifies and eliminates attackers [21], [22], [25]. They have also been introduced implicitly by other detection schemes [9]-[20], where a trusted source sets up and distributes hash values, MAC tags, and keys. In this work, we explicitly use a standalone trusted controller to support the commitment.

IV. The Hash-Based Detection 
A. Key Observations and Approach 

Observation 1. Let us revisit the discussion of applying homomorphic hash functions to inter-session network coding in Section III-C. We observe that the main reason why $S_2$ can successfully pollute flow $S_1$-$R_1$ is that $S_2$ is able to distribute different hash values, those of $v_2$ and $v_2'$, to $A$, $B$, and $R_1$. If all nodes in the network receive the same hash value, either the hash of $v_2$ or that of $v_2'$, then $S_2$ will not be able to carry out the attack because one of the two will be dropped due to an incorrect hash. Ensuring that all nodes in the network receive the same hash value of $v_2$ or $v_2'$ is in fact equivalent to forcing $S_2$ to commit to either $v_2$ or $v_2'$, thus making any linear combination involving the other (non-committed) packet a corrupted packet.

Observation 2. As mentioned in Section III-B, in inter-session network coding, it is often the case that intermediate nodes completely decode coded packets and recover their corresponding source packets. We exploit this fact and propose to use traditional hash functions to check the integrity of these decodable packets. In other words, instead of verifying a coded packet using an expensive homomorphic hash verification, a node decodes it and verifies the recovered packet using an inexpensive traditional hash verification. Note that a traditional hash verification is two to three orders of magnitude less expensive than a homomorphic one. This observation is especially beneficial to COPE-like coding schemes [1], where every coded packet is decodable by any next hop.

Fig. 2. Commitment and hash distribution for the network of Fig. 1.

Approach. Our hash-based detection scheme needs a trusted 
controller. Denote this controller by C. The scheme is based 
on the above observations and works as follows: 

Setup: $C$ sends to every node the description of a homomorphic hash function (e.g., $\mathcal{H}$-DL, described in the next section) as well as a traditional hash function, e.g., SHA-1. Before sending, each source, $S_i$ ($i \in [1, s]$), augments its data following the augmentation scheme described in Section III-A. For every source packet, $v_{i,j}$ ($j \in [1, g]$), $S_i$ computes a homomorphic hash value and a traditional hash value, denoted as $h_{i,j}$ and $\hat{h}_{i,j}$, respectively. Each source then sends both $h_{i,j}$ and $\hat{h}_{i,j}$ to $C$. The commitment of each source is the set of pairs $(h_{i,j}, \hat{h}_{i,j})$. Every node downloads these pairs from $C$. We assume that the hash descriptions and values are distributed through authentic (tamper-resistant) channels, as in usual applications of hash functions. Fig. 2 illustrates how the hashes are distributed for the network of Fig. 1.

Sending: At each node, sending packets, including linearly
combining incoming packets, is performed as usual.

Receiving and Verification: Upon receiving a packet y, if a node 
is specified to decode by the coding scheme, it checks if it can 
recover a source packet by decoding using y and its previously 
received packets. (i) If it can, it uses the traditional hash check
to verify the integrity of the packet. (ii) If it cannot, or in the case
the node is not specified to decode, it uses the homomorphic 
hash check to verify the integrity of y. If the recovered source 
packet (case (i)) or y (case (ii)) passes the verification, the node 
marks y as legitimate and uses it in subsequent transmissions; 
otherwise, it drops y. 
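
The receiving and verification step can be sketched as a small dispatch routine. This is a hedged illustration, not the authors' implementation: the function name `verify_packet`, the callables `try_decode` and `homomorphic_test`, and the dictionary of committed SHA-1 digests are all hypothetical stand-ins for the decoder, the homomorphic Test algorithm, and the values downloaded from the controller.

```python
import hashlib

def verify_packet(y, is_decoder, try_decode, committed_sha1, homomorphic_test):
    """Return True if y should be marked legitimate, False if it should be dropped.

    is_decoder       -- whether the coding scheme specifies this node to decode
    try_decode(y)    -- returns (index, recovered_data_bytes) or None (hypothetical)
    committed_sha1   -- dict: source packet index -> hex digest from the controller
    homomorphic_test -- callable standing in for Test(pp, data, coefficients, h)
    """
    recovered = try_decode(y) if is_decoder else None
    if recovered is not None:
        # Case (i): a source packet was recovered, use the cheap traditional check.
        index, data = recovered
        return hashlib.sha1(data).hexdigest() == committed_sha1[index]
    # Case (ii): not decodable here, fall back to the homomorphic check on y itself.
    return homomorphic_test(y)

committed = {0: hashlib.sha1(b"source-data").hexdigest()}
accept_decoded = verify_packet(b"coded", True, lambda y: (0, b"source-data"),
                               committed, lambda y: False)
reject_decoded = verify_packet(b"coded", True, lambda y: (0, b"tampered"),
                               committed, lambda y: True)
fallback = verify_packet(b"coded", False, lambda y: (0, b"source-data"),
                         committed, lambda y: True)
```

The design point the sketch makes is that the expensive homomorphic check is only reached when decoding is impossible or not required at the node.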

B. Homomorphic Hash Scheme 

A homomorphic hash scheme consists of three polynomial- 
time algorithms: 



• HashSetup($1^\lambda$, $n$): Input: unary representation of the security parameter $\lambda$, and the dimension of the data space $n$. Output: public parameters pp.

• Hash(pp, $\bar{v}$): Input: public parameters pp and a data vector $\bar{v} \in \mathbb{F}_q^n$. Output: hash value, $h \in \mathbb{F}_q$, of $\bar{v}$.

- The hash of $y$, a linear combination of $m$ source data vectors $\bar{v}_i$, $i \in [1, m]$, is a hash vector $\mathbf{h} = (h_1, \cdots, h_m)$, where $h_i = \mathrm{Hash}(\mathrm{pp}, \bar{v}_i)$.

• Test(pp, $\bar{y}$, $\beta$, $\mathbf{h}$): Input: public parameters pp, a vector $\bar{y} \in \mathbb{F}_q^n$, a vector of coefficients $\beta \in \mathbb{F}_q^m$, and a hash vector $\mathbf{h} \in \mathbb{F}_q^m$. Output: $\top$ (true) or $\bot$ (false).

Intuitively, let $\mathbf{h}$ be the set of hashes of the data of the source packets. For a packet $y$ with data $\bar{y}$ and coding coefficients $\beta$, if $y$ is a linear combination of the source packets then Test should output $\top$. Also, it should be difficult for an adversary to find a packet $y$ outside of the source space such that Test outputs $\top$.

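Correctness. For all pp $\leftarrow$ HashSetup($1^\lambda$, $n$), we require the following properties for the correctness of the scheme:

• For all $\bar{v} \in \mathbb{F}_q^n$, if $h = \mathrm{Hash}(\mathrm{pp}, \bar{v})$ then for all $i \in [1, m]$, $\mathrm{Test}(\mathrm{pp}, \bar{v}, e_i, \mathbf{h}) = \top$, where $e_i$ is the $i$-th unit vector of the space $\mathbb{F}_q^m$ and the $j$-th component of $\mathbf{h}$, $\mathbf{h}^{(j)}$, is defined as follows: $\mathbf{h}^{(j)}$ equals $h$ if $j$ equals $i$ and equals $r_j$ otherwise, where $r_j$ is any value in $\mathbb{F}_q$.

• For all $\bar{y}_1, \bar{y}_2 \in \mathbb{F}_q^n$, $\beta_1, \beta_2, \mathbf{h} \in \mathbb{F}_q^m$, and $\alpha_1, \alpha_2 \in \mathbb{F}_q$, let $\bar{y} = \alpha_1 \bar{y}_1 + \alpha_2 \bar{y}_2$ and $\beta = \alpha_1 \beta_1 + \alpha_2 \beta_2$. We require that if $\mathrm{Test}(\mathrm{pp}, \bar{y}_i, \beta_i, \mathbf{h}) = \top$ for $i = 1, 2$ then $\mathrm{Test}(\mathrm{pp}, \bar{y}, \beta, \mathbf{h}) = \top$.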

Security. Let $\mathcal{H}$ = (HashSetup, Hash, Test) be a homomorphic hash. Let $\mathcal{A}$ be a probabilistic polynomial time (PPT) adversary that takes as input pp $\leftarrow$ HashSetup($1^\lambda$, $n$) and outputs $v^* \in \mathbb{F}_q^{n+m}$, an $m$-dimensional space $V$ represented as basis vectors $v_1, \cdots, v_m$, and a hash vector $\mathbf{h} \in \mathbb{F}_q^m$.

Definition 2. We say that $\mathcal{A}$ breaks the homomorphic hash scheme $\mathcal{H}$ if (i) $v^* \notin V$, (ii) $\mathrm{Test}(\mathrm{pp}, \bar{v}_i, e_i, \mathbf{h}) = \top$ for $i = 1, \cdots, m$, and (iii) $\mathrm{Test}(\mathrm{pp}, \bar{v}^*, \mathrm{aug}(v^*), \mathbf{h}) = \top$. We define the advantage Hash-Adv$[\mathcal{A}, \mathcal{H}]$ of $\mathcal{A}$ to be the probability that $\mathcal{A}$ breaks $\mathcal{H}$. We say that $\mathcal{H}$ is secure if for all PPT $\mathcal{A}$, Hash-Adv$[\mathcal{A}, \mathcal{H}]$ is negligible in the security parameter $\lambda$.

Example Homomorphic Hash $\mathcal{H}$-DL. This construction is based on $\mathcal{VH}$-DL [27] but customized to work with our augmentation scheme.

• HashSetup($1^\lambda$, $n$):

- Choose a finite cyclic group $G$ of prime order $q > 2^\lambda$.

- Choose generators $g_i \xleftarrow{R} G \setminus \{1\}$ for $i = 1, \cdots, n$.

- Output pp := $(q, (g_1, \cdots, g_n))$ and the description of $G$.

• Hash(pp, $\bar{v}$):

- Output $h := \prod_{i=1}^{n} \exp(g_i, \bar{v}^{(i)})$, where $\exp(a, b) = a^b$.

• Test(pp, $\bar{y}$, $\beta$, $\mathbf{h}$): If

$$\prod_{i=1}^{n} \exp(g_i, \bar{y}^{(i)}) = \prod_{i=1}^{m} \exp(\mathbf{h}^{(i)}, \beta^{(i)})$$

then output $\top$; otherwise, output $\bot$.
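
A toy instance of the construction above can be written in a few lines of Python. The sketch assumes, purely for illustration, that $G$ is the order-$q$ subgroup of $\mathbb{Z}_p^*$ for the small safe prime $p = 2039 = 2 \cdot 1019 + 1$; these parameters offer no security, and the function names `hash_setup`, `hdl_hash`, and `hdl_test` are hypothetical.

```python
import random

P, Q = 2039, 1019  # toy safe prime P = 2Q + 1; the order-Q subgroup of Z_P^* plays the role of G

def hash_setup(n, rng):
    """Pick n non-identity elements of the order-Q subgroup (squares of x not in {1, P-1})."""
    return [pow(rng.randrange(2, P - 1), 2, P) for _ in range(n)]

def hdl_hash(gens, data):
    """h = prod_i g_i^{data_i} in G."""
    h = 1
    for g, v in zip(gens, data):
        h = h * pow(g, v % Q, P) % P
    return h

def hdl_test(gens, data, beta, hashes):
    """Check prod_i g_i^{data_i} == prod_i h_i^{beta_i}."""
    rhs = 1
    for h, b in zip(hashes, beta):
        rhs = rhs * pow(h, b % Q, P) % P
    return hdl_hash(gens, data) == rhs

rng = random.Random(0)
gens = hash_setup(3, rng)
v1, v2 = [1, 2, 3], [4, 5, 6]                      # data of two source packets (m = 2)
hashes = [hdl_hash(gens, v1), hdl_hash(gens, v2)]  # committed hash vector

a1, a2 = 7, 9
y = [(a1 * x + a2 * z) % Q for x, z in zip(v1, v2)]  # data of a coded packet
ok = hdl_test(gens, y, [a1, a2], hashes)

y_bad = [(y[0] + 1) % Q] + y[1:]                   # same coefficients, altered data
bad = hdl_test(gens, y_bad, [a1, a2], hashes)
```

Altering one data symbol multiplies the left-hand side of the test by a non-identity generator, so the altered packet fails the check while the honest combination passes.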
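The correctness conditions hold as follows:

(1) For all $\bar{v} \in \mathbb{F}_q^n$, if $h = \mathrm{Hash}(\mathrm{pp}, \bar{v})$ then for all $j \in [1, m]$:

$$\prod_{i=1}^{n} \exp(g_i, \bar{v}^{(i)}) = h = \prod_{i=1}^{m} \exp(\mathbf{h}^{(i)}, e_j^{(i)}).$$

As a result, $\mathrm{Test}(\mathrm{pp}, \bar{v}, e_j, \mathbf{h}) = \top$.

(2) For all $\bar{y}_1, \bar{y}_2 \in \mathbb{F}_q^n$, $\beta_1, \beta_2, \mathbf{h} \in \mathbb{F}_q^m$, and $\alpha_1, \alpha_2 \in \mathbb{F}_q$, let $\bar{y} = \alpha_1 \bar{y}_1 + \alpha_2 \bar{y}_2$ and $\beta = \alpha_1 \beta_1 + \alpha_2 \beta_2$. If $\mathrm{Test}(\mathrm{pp}, \bar{y}_i, \beta_i, \mathbf{h}) = \top$ for $i = 1, 2$ then

$$\begin{aligned}
\prod_{i=1}^{n} \exp(g_i, \bar{y}^{(i)}) &= \prod_{i=1}^{n} \exp\left(g_i, \alpha_1 \bar{y}_1^{(i)} + \alpha_2 \bar{y}_2^{(i)}\right) \\
&= \left[\prod_{i=1}^{n} \exp(g_i, \bar{y}_1^{(i)})\right]^{\alpha_1} \left[\prod_{i=1}^{n} \exp(g_i, \bar{y}_2^{(i)})\right]^{\alpha_2} \\
&= \left[\prod_{i=1}^{m} \exp(\mathbf{h}^{(i)}, \beta_1^{(i)})\right]^{\alpha_1} \left[\prod_{i=1}^{m} \exp(\mathbf{h}^{(i)}, \beta_2^{(i)})\right]^{\alpha_2} \\
&= \prod_{i=1}^{m} \exp(\mathbf{h}^{(i)}, \beta^{(i)}).
\end{aligned}$$

As a result, $\mathrm{Test}(\mathrm{pp}, \bar{y}, \beta, \mathbf{h}) = \top$.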

Theorem 1. The homomorphic hash $\mathcal{H}$-DL is secure assuming the discrete logarithm problem in $G$ is hard. In particular, let $\mathcal{A}$ be a PPT adversary that breaks $\mathcal{H}$-DL; then there exists a polynomial-time algorithm $\mathcal{B}$ that computes discrete logarithms in $G$ such that Hash-Adv$[\mathcal{A}, \mathcal{H}\text{-DL}] \le 2 \cdot$ DL-Adv$[\mathcal{B}, G]$, where DL-Adv$[\mathcal{B}, G]$ is the probability that $\mathcal{B}$ computes discrete logarithms in $G$ (formally defined in [32]).



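Proof (based on the proof of $\mathcal{VH}$-DL): If $\mathcal{A}$ can break $\mathcal{H}$-DL, it can output $v^*$, $v_1, \cdots, v_m$, and $\mathbf{h}$ that satisfy Definition 2. Thus,

$$\prod_{i=1}^{n} \exp(g_i, \bar{v}^{*(i)}) = \prod_{i=1}^{m} \exp(\mathbf{h}^{(i)}, \mathrm{aug}(v^*)^{(i)}).$$

Let $v = \sum_{i=1}^{m} \mathrm{aug}(v^*)^{(i)} v_i$. Since $v^*$ is not a linear combination of $v_1, \cdots, v_m$, $v^* \neq v$. Since $\mathrm{Test}(\mathrm{pp}, \bar{v}_i, e_i, \mathbf{h}) = \top$ for $i = 1, \cdots, m$, and $v$ is a linear combination of $v_1, \cdots, v_m$, $\mathrm{Test}(\mathrm{pp}, \bar{v}, \mathrm{aug}(v^*), \mathbf{h}) = \top$. This means

$$\prod_{i=1}^{n} \exp(g_i, \bar{v}^{(i)}) = \prod_{i=1}^{m} \exp(\mathbf{h}^{(i)}, \mathrm{aug}(v^*)^{(i)}).$$

Consequently, $\mathcal{A}$ can find two distinct vectors $\bar{v}^*, \bar{v} \in \mathbb{F}_q^n$ such that

$$\prod_{i=1}^{n} \exp(g_i, \bar{v}^{*(i)}) = \prod_{i=1}^{n} \exp(g_i, \bar{v}^{(i)}).$$
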
Assume A can find this collision with probability e then A can 
be used to compute discrete logarithms in G with probability 
at least e/2 based on Theorem 3.4 in p9| . ■ 



C. Detection Guarantees 

Using the downloaded hashes, all nodes in the network can 
verify the integrity of all downloaded packets on-the-fly. The 
following theorem summarizes the security guarantee of our 
hash-based detection scheme. 

Theorem 2. If a secure homomorphic hash scheme and a
secure traditional hash function are used in the detection scheme,
then the probability of a benign node accepting a corrupted 
packet is negligible in the security parameter. 

Proof: For a received packet $y$, nodes that are specified to perform decoding but cannot recover any source packet, and nodes that are not specified to perform decoding, verify the integrity of $y$ using the verification of the homomorphic hash scheme. Let $\mathbf{h} = \{h_1, \cdots, h_m\}$, where $h_i$, $i \in [1, m]$, denotes the hash value of the data, $\bar{v}_i$, of the source packet $v_i$. A corrupted packet is a packet whose data is not in the committed source data space; hence, if $y$ is corrupted then $\bar{y} \notin \mathrm{span}(\bar{v}_1, \cdots, \bar{v}_m)$. As a result, the probability that any node $N$ in the network accepts a corrupted packet is upper bounded by the probability of breaking the homomorphic hash scheme, which is negligible in the security parameter $\lambda$.

For a node $N$ that is specified to perform decoding and can recover a source packet by decoding using $y$ and previously received (verified) packets, it checks the integrity of $y$ by checking the integrity of the newly recovered source packet. The probability of accepting a corrupted $y$ now depends not only on the probability that the newly recovered source packet is corrupted but passes the verification, but also on the probability that some of the previously received packets are corrupted but passed the verification. The proof is by induction:

Let negl denote a negligible function. Let $p_x$ and $p_y$ be the probabilities of breaking the traditional hash and homomorphic hash functions, respectively. Note that both of these probabilities are negligible. Let $y_i$ denote the $i$-th packet that arrives at node $N$. Let $\Pr[y_i]$ denote the probability that node $N$ accepts a corrupted packet $y_i$. The first packet is either a source packet or not, thus $N$ performs either a traditional hash check or a homomorphic hash check. Hence,

$$\Pr[y_1] \le p_x + p_y = \mathrm{negl}.$$

If the $t$-th packet is decodable, let $y_t = \sum_{i=1}^{t-1} \alpha_i y_i + \beta v$, where $v$ is the newly recovered source packet; the $y_i$'s are previously received, verified packets; and the $\alpha_i$'s and $\beta$ are some integer coefficients. The probability that $y_t$ is corrupted but accepted by $N$ is

$$\Pr[y_t] = \sum_{i=1}^{t-1} \Pr[y_i] \cdot \Pr[\alpha_i \neq 0] + p_x \le \sum_{i=1}^{t-1} \Pr[y_i] + \mathrm{negl}.$$

Since $t$ is upper bounded by $cm$, where $c$ is some small positive integer, and $\Pr[y_1] = \mathrm{negl}$, $\Pr[y_t]$ is negligible for all $t \le cm$. ■

Finally, our hash-based detection scheme is collusion resis- 
tant because collusion does not help to break the discrete log 
assumption or a secure traditional hash function. 

V. The MAC-Based Defense 
A. Key Observation 

Observation 3. Let us revisit the discussion of applying a homomorphic MAC scheme in Section III-C. From the attack, we observe that it is necessary that (i) each source generates tags of its packets using its own secret key as opposed to using a common key, or (ii) the controller generates all the tags under a key kept secret from all the sources.

B. Homomorphic Multi-Source MAC (InterMac) 

In this section, we present a novel multi-source homomorphic 
MAC scheme, called InterMac, that allows different sources 
to generate tags using different keys. Nonetheless, the tags are 
combinable, and the malicious nodes cannot generate valid tags 
of corrupted packets. 

Definitions: A $(q, n, s, g)$ multi-source homomorphic MAC scheme is defined by four PPT algorithms:

• Generate(id, $k$, $\Pi$): Input: a source space/generation identifier, id; a secret key, $k \in \mathcal{K}$; and a committed source space, $\Pi$. $k$ is known only to the trusted controller and is used for bootstrapping the MAC keys. Output: a key set $K = \{k_1, \cdots, k_s\}$. The id is the unique source space/generation identifier. Given the committed source space $\Pi$, the Generate algorithm generates $s$ keys, where the $i$-th key can be used by source $i$ to generate tags for its source packets.

• Sign($k_i$, $v$): Input: key $k_i \in K$ used by source $S_i$ and a source packet $v$ sent by source $S_i$. Output: tag $t$ of $v$. Let $v_{i,1}, \cdots, v_{i,g}$ denote the source packets sent by source $S_i$. The Sign algorithm signs the source space, $\Pi$, spanned by the source packets of all the sources by running Sign($k_i$, $v_{i,j}$), for all $i \in [1, s]$, $j \in [1, g]$.

• Combine($(y_1, t_1, \alpha_1), \cdots, (y_\ell, t_\ell, \alpha_\ell)$): Input: $\ell$ ($\ell > 0$) vectors $y_1, \cdots, y_\ell \in \mathbb{F}_q^{n+m}$; their tags $t_1, \cdots, t_\ell \in \mathbb{F}_q$; and their coefficients $\alpha_1, \cdots, \alpha_\ell \in \mathbb{F}_q$. Output: tag $t$ of vector $y \stackrel{\text{def}}{=} \sum_{i=1}^{\ell} \alpha_i y_i$.

• Verify($K$, $y$, $t$): Input: a key set $K$, a vector $y \in \mathbb{F}_q^{n+m}$, and its tag $t \in \mathbb{F}_q$. Output: 0 (reject) or 1 (accept).



Correctness: The scheme must satisfy the following correctness requirement: Let $\Pi$ be the committed source space spanned by the committed source packets of all the sources: $v_{i,j}$, for all $i \in [1, s]$ and $j \in [1, g]$. Let $\Pi$'s identifier be id. Let $k \in \mathcal{K}$, and let $K = \{k_1, \cdots, k_s\}$ be the output of Generate given id, $k$, and $\Pi$. Let $t_{i,j} = \mathrm{Sign}(k_i, v_{i,j})$ and $\alpha_{i,j} \in \mathbb{F}_q$, for all $i \in [1, s]$ and $j \in [1, g]$. Let $t = \mathrm{Combine}((v_{1,1}, t_{1,1}, \alpha_{1,1}), \cdots, (v_{s,g}, t_{s,g}, \alpha_{s,g}))$. Then

$$\mathrm{Verify}\left(K,\ \sum_{i=1}^{s} \sum_{j=1}^{g} \alpha_{i,j} v_{i,j},\ t\right) = 1.$$



Security: We define the security using the following game:

Attack Game. We consider the following attack game for a multi-source homomorphic MAC $\mathcal{T}$ = (Generate, Sign, Combine, Verify), a challenger $\mathcal{C}$, and an adversary $\mathcal{A}$:

• Setup: The challenger generates a random key $k \xleftarrow{R} \mathcal{K}$.

• Queries: $\mathcal{A}$ adaptively queries $\mathcal{C}$. Each query is of the form $(\mathrm{id}_l, \Pi_l)$, where $\Pi_l$ is a linear subspace represented by a basis of $m$ vectors, $v_{i,j}$, $i \in [1, s]$, $j \in [1, g]$, and $\mathrm{id}_l$ is the space identifier. We require that all identifiers $\mathrm{id}_l$ submitted by $\mathcal{A}$ are distinct. To respond to a query for $(\mathrm{id}_l, \Pi_l)$, the challenger does the following: Run Generate($\mathrm{id}_l$, $k$, $\Pi_l$) to produce a key set $K_l = \{k_1, \cdots, k_s\}$. Compute $t_{i,j} = \mathrm{Sign}(k_i, v_{i,j})$, for all $i \in [1, s]$, $j \in [1, g]$. Send $(t_{1,1}, \cdots, t_{s,g})$ and all keys in $K_l$ but one to $\mathcal{A}$.

• Output: The adversary $\mathcal{A}$ outputs a triplet $(\mathrm{id}^*, y^*, t^*)$. We consider that the adversary wins the security game if

(i) $\mathrm{id}^* = \mathrm{id}_l$ for some $l$,

(ii) $y^* \notin \Pi_l$, and

(iii) $\mathrm{Verify}(K_l, y^*, t^*) = 1$.

Requirement (i) is necessary as a corrupted packet is only defined when there is a committed source space. Requirement (ii) indicates that the packet output by $\mathcal{A}$ is indeed a corrupted packet. Finally, (iii) indicates that $\mathcal{A}$ successfully forges a valid tag of the corrupted packet. Let Adv$[\mathcal{A}, \mathcal{T}]$ denote the probability that $\mathcal{A}$ wins the above attack game. We define a secure multi-source homomorphic MAC scheme as follows:

Definition 3. A $(q, n, s, g)$ multi-source homomorphic MAC scheme $\mathcal{T}$ is secure if and only if for all PPT adversaries $\mathcal{A}$, Adv$[\mathcal{A}, \mathcal{T}]$ is negligible.

The Construction of InterMac. We now present our con- 
struction of InterMac. The key ingredient of this construction is 
the generation of the key set JC so that each source can compute 
tags of its source packets using its own key; nonetheless, the 
tags are still combinable. 

• Generate(id, $k$, $\Pi$):

- Let $v_{1,1}, \dots, v_{s,g} \in \mathbb{F}_q^{n+m}$ be the committed source
packets that span $\Pi$, and let them be represented as row
vectors. For each $p \in [1,s]$, let $M_p$ be a matrix whose rows
are the vectors in the following set:

$$\{\, v_{i,j} \mid i = 1, \dots, s;\ i \neq p;\ j = 1, \dots, g \,\} .$$

In other words, $M_p$ is a matrix consisting of the committed
source packets of all sources other than $S_p$. Note that
$\mathrm{rank}(M_p) = m - g$. Let $\Pi_{M_p}$ denote the space spanned by
the rows of $M_p$.

- The null space of the matrix $M_p$, denoted as $\Pi^{\perp}_{M_p}$, is the
set of all row vectors $z \in \mathbb{F}_q^{n+m}$ for which $M_p z^T = 0$. For
any $(m - g) \times (n + m)$ matrix $M_p$, we have

$$\mathrm{rank}(M_p) + \mathrm{nullity}(M_p) = n + m ,$$

known as the rank-nullity theorem, where $\mathrm{nullity}(M_p)$ is the
dimension of $\Pi^{\perp}_{M_p}$. Thus,

$$\dim\big(\Pi^{\perp}_{M_p}\big) = n + m - (m - g) = n + g .$$

- Let $b_1, \dots, b_{n+g} \in \mathbb{F}_q^{n+m}$ be a basis of $\Pi^{\perp}_{M_p}$. This basis
can be found by solving $M_p z^T = 0$. Let $F$ be a Pseudo
Random Function (PRF): $\mathcal{K} \times (\mathcal{I} \times [1,s] \times [1,n+g]) \to \mathbb{F}_q$,
where $\mathcal{I}$ denotes the domain of the source space identifier.
To generate key $k_p$ for source $S_p$, the controller computes

$$r_i \leftarrow F(k, \mathrm{id}, p, i) \in \mathbb{F}_q,\ \forall i \in [1, n+g], \qquad k_p \leftarrow \sum_{i=1}^{n+g} r_i\, b_i \in \mathbb{F}_q^{n+m} .$$

- Output: a key set $K = \{k_1, \dots, k_s\}$, where each key
$k_p$, $p \in [1,s]$, is generated as above.

• Sign($k_i$, $v$): Outputs $t \leftarrow k_i \cdot v \in \mathbb{F}_q$.

• Combine($(y_1, t_1, \alpha_1), \dots, (y_\ell, t_\ell, \alpha_\ell)$): Outputs the sum
$t \leftarrow \sum_{i=1}^{\ell} \alpha_i t_i \in \mathbb{F}_q$.

• Verify($K$, $y$, $t$): Compute $t' = y \cdot (k_1 + \dots + k_s)$, where
$k_i \in K$. If $t = t'$, output 1; otherwise, output 0.

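Correctness: Recall from the correctness requirement that

$$t = \sum_{i=1}^{s} \sum_{j=1}^{g} \alpha_{i,j}\, t_{i,j} = \sum_{i=1}^{s} \sum_{j=1}^{g} \alpha_{i,j}\, (v_{i,j} \cdot k_i) .$$

Also, $t'$ computed by the verification algorithm equals

$$t' = \Big( \sum_{i=1}^{s} \sum_{j=1}^{g} \alpha_{i,j}\, v_{i,j} \Big) \cdot (k_1 + \dots + k_s) \overset{(1)}{=} \sum_{i=1}^{s} \sum_{j=1}^{g} \alpha_{i,j}\, (v_{i,j} \cdot k_i) .$$

Equality (1) holds because, by construction, for all $i \neq p$, $i \in [1,s]$,
$p \in [1,s]$, and $j \in [1,g]$, $v_{i,j} \cdot k_p = 0$. As computed, $t' = t$.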
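
The construction and its correctness argument can be illustrated with a small sketch. It is not the authors' implementation: the prime `Q`, the sample packets, and the use of Python's `random` module in place of the PRF $F$ are all assumptions made for illustration only.

```python
import random

Q = 2_147_483_647  # toy prime field size (an assumption; the paper keeps q generic)

def dot(a, b, q=Q):
    return sum(x * y for x, y in zip(a, b)) % q

def null_space_basis(M, q=Q):
    """Basis of {z : M z^T = 0} over F_q, via Gauss-Jordan elimination."""
    M = [[x % q for x in row] for row in M]
    ncols, pivots, r = len(M[0]), [], 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], -1, q)
        M[r] = [(x * inv) % q for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(x - f * y) % q for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    basis = []
    for f in (c for c in range(ncols) if c not in pivots):
        z = [0] * ncols          # one basis vector per free column
        z[f] = 1
        for row, p in zip(M, pivots):
            z[p] = (-row[f]) % q
        basis.append(z)
    return basis

def generate(packets, rng, q=Q):
    """One key per source: a random vector orthogonal to all other sources' packets."""
    keys = []
    for p in range(len(packets)):
        M_p = [v for i, src in enumerate(packets) if i != p for v in src]
        basis = null_space_basis(M_p, q)
        coeffs = [rng.randrange(q) for _ in basis]  # stand-in for F(k, id, p, i)
        keys.append([sum(c * b[j] for c, b in zip(coeffs, basis)) % q
                     for j in range(len(M_p[0]))])
    return keys

def sign(key, v, q=Q):
    return dot(key, v, q)

def combine(triples, q=Q):
    """triples: (packet, tag, coefficient); returns the tag of the linear combination."""
    return sum(a * t for _, t, a in triples) % q

def verify(keys, y, t, q=Q):
    key_sum = [sum(col) % q for col in zip(*keys)]
    return dot(y, key_sum, q) == t

# Two sources (s = 2), two packets each (g = 2), n = 3 data symbols, m = 4 coefficients.
packets = [
    [[1, 2, 3, 1, 0, 0, 0], [4, 5, 6, 0, 1, 0, 0]],
    [[7, 8, 9, 0, 0, 1, 0], [1, 0, 2, 0, 0, 0, 1]],
]
rng = random.Random(1)
keys = generate(packets, rng)
tagged = [(v, sign(keys[i], v), rng.randrange(1, Q))
          for i, src in enumerate(packets) for v in src]
y = [sum(a * v[j] for v, _, a in tagged) % Q for j in range(7)]
t = combine(tagged)
```

Each source signs with its own key, yet the combined tag verifies under the key sum because every cross term $v_{i,j} \cdot k_p$ with $i \neq p$ vanishes.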

Security: We prove the security of InterMac assuming $F$ is
a secure PRF. For a PRF adversary $B$, we let $\mathrm{PRF\text{-}Adv}[B, F]$
denote $B$'s advantage in winning the PRF security game w.r.t.
$F$. The definition of the PRF security game is provided in [32].

Theorem 3. For any fixed $q$, $n$, $s$, $g$, InterMac is a secure
$(q, n, s, g)$ multi-source homomorphic MAC, assuming $F$ is a
secure PRF. In particular, for every multi-source homomorphic
MAC adversary $A$, there is a PRF adversary $B$ who has similar
running time to $A$, such that

$$\mathrm{Adv}[A, \mathrm{InterMac}] \le \mathrm{PRF\text{-}Adv}[B, F] + \frac{1}{q} .$$

Proof: The proof uses a sequence of games denoted
as Game 0 and Game 1. Let $W_0$ and $W_1$ denote the events that $A$
wins the multi-source homomorphic MAC security game in Game 0
and Game 1, respectively. Let Game 0 be identical to Attack Game
1. Hence,

$$\Pr[W_0] = \mathrm{Adv}[A, \mathrm{InterMac}] . \tag{1}$$

In Game 1, the PRF $F$ is replaced by a truly random function,
i.e., to respond to the queries, the challenger computes
$k_p \leftarrow \sum_{i=1}^{n+g} r_i x_i$, where $r_i \xleftarrow{R} \mathbb{F}_q$ instead of $r_i \leftarrow F(k, \mathrm{id}, p, i)$.
Everything else remains the same. Then, there exists a PRF
adversary $B$ such that

$$|\Pr[W_0] - \Pr[W_1]| = \mathrm{PRF\text{-}Adv}[B, F] . \tag{2}$$

The complete challenger in Game 1 works as follows:

• Queries: $A$ submits MAC queries $(\mathrm{id}, \Pi)$, where $\Pi = \mathrm{span}(v_{1,1}, \dots, v_{s,g})$.
For each $p \in [1,s]$, $C$ computes a basis
of $\Pi^{\perp}_{M_p}$: $x_1, \dots, x_{n+g}$. Then, in order to generate $k_p$, $C$ does

- $r_i \xleftarrow{R} \mathbb{F}_q$, $\forall i \in [1, n+g]$.
- $k_p \leftarrow \sum_{i=1}^{n+g} r_i x_i \in \mathbb{F}_q^{n+m}$.

In other words, each $k_p$ is chosen uniformly at random from
$\Pi^{\perp}_{M_p}$, a subspace of size $q^{n+g}$. The challenger $C$ then computes
tags for the committed source packets: for $i = 1, \dots, s$ and
$j = 1, \dots, g$, $t_{i,j} \leftarrow k_i \cdot v_{i,j}$.
Finally, $C$ sends all the tags and all the keys but one to $A$.
Without loss of generality, assume that $C$ keeps $k_1$ secret from $A$.

• Output: $A$ eventually outputs a triplet $(\mathrm{id}^*, y^*, t^*)$. Assume
that $\mathrm{id}^* = \mathrm{id}_l$, for some $l$. Let $K_l = \{k_1, \dots, k_s\}$ denote the
key set generated for query $(\mathrm{id}_l, \Pi_l)$. The adversary wins the
game, i.e., event $W_1$ happens, if

- $y^* \notin \Pi_l$, and

- $t^* = y^* \cdot (k_1 + \dots + k_s)$.

Note that the adversary knows $k_2, \dots, k_s$; therefore, if $y^* \cdot k_1$
is known, the adversary will be able to forge a valid $t^*$. In what
follows, we will show that $y^* \cdot k_1$ is indistinguishable from a
random value in $\mathbb{F}_q$. Let $\Pi_l = \mathrm{span}(v_{1,1}, \dots, v_{s,g})$. Consider
the following system of linear equations:

$$\begin{aligned}
v_{1,1} \cdot k_1 &= t_{1,1} \\
&\ \ \vdots \\
v_{1,g} \cdot k_1 &= t_{1,g} \\
v_{2,1} \cdot k_1 &= 0 \\
&\ \ \vdots \\
v_{s,g} \cdot k_1 &= 0 \\
y^* \cdot k_1 &= t^* - y^* \cdot (k_2 + \dots + k_s)
\end{aligned}$$

The first $sg$ equations represent all information that the adversary
learns about $k_1$ from its query $(\mathrm{id}_l, \Pi_l)$. Note that since
$y^* \notin \Pi_l$, $y^*$ and $v_{i,j}$ ($i \in [1,s]$, $j \in [1,g]$) are linearly
independent. As a result, the above system of equations is
consistent regardless of the value of $t^*$ because the coefficient
matrix has rank $sg + 1$, which equals the number of equations.
Furthermore, for a fixed $y^*$, for any value $t^* \in \mathbb{F}_q$, the solution
space always has the same size $q^{n+sg-(sg+1)} = q^{n-1}$. Because
$k_1$ is chosen uniformly at random from $\Pi^{\perp}_{M_1}$, and all solutions
to the above system of equations are in $\Pi^{\perp}_{M_1}$, for a fixed $y^*$,
its valid tag $t^*$ could be any value in $\mathbb{F}_q$ equally likely. As a
result, the probability that the adversary chooses a correct $t^*$ is
$\frac{1}{q}$. Thus,

$$\Pr[W_1] = \frac{1}{q} . \tag{3}$$

Equations (1), (2), and (3) together prove the theorem. ■

Fig. 3. An example demonstrating the minimum amount of information
required for carrying out the verification at each receiver when using InterMac.

Theorem 3 expresses that an adversary $A$ can only forge a
valid tag of a corrupted packet with probability $\frac{1}{q}$. This security
guarantee may be unsatisfactory when working with a small
field, e.g., $q = 2^8$. Nevertheless, as suggested in [17], [18], [20],
[23], the security can be improved by increasing the field size or
using multiple tags. When using $l$ tags, the security is $\frac{1}{q^l}$. Note
that using multiple tags to increase the security is preferable
as increasing the field size increases the field multiplication
complexity logarithmically [18].
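
As a quick worked example of the multiple-tag trade-off, the sketch below computes the forgery probability $\frac{1}{q^l}$ and the number of tags needed to reach a target security level. The function names and the notion of a target expressed in bits are assumptions introduced here for illustration.

```python
import math
from fractions import Fraction

def forgery_probability(field_bits: int, num_tags: int) -> Fraction:
    """Probability 1/q^l of forging l independent tags when q = 2^field_bits."""
    return Fraction(1, 2 ** (field_bits * num_tags))

def tags_needed(field_bits: int, security_bits: int) -> int:
    """Smallest l such that 1/q^l <= 2^-security_bits (hypothetical target)."""
    return math.ceil(security_bits / field_bits)

# With q = 2^8, a single tag is forged with probability 1/256.
single_tag = forgery_probability(8, 1)
# Reaching a 2^-80 forgery probability over the same field takes 10 tags.
l_for_80_bits = tags_needed(8, 80)
```

Under these assumptions, ten one-byte tags give the same bound as a single tag over an 80-bit field, while keeping every multiplication in the small field.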

Remarks. We make the following two important observations 
w.r.t. the verification done in InterMac: (i) a node only needs 
to know the sum of the keys for the verification, and (ii) when 
there is an upper bound M on the number of possible malicious 
sources, it may suffice for a verifying node to know the sum 
of just M + 1 keys to carry out the verification. 

For instance, consider the network given in Fig. 3. There are
4 source-receiver pairs: $(S_1, R_1), \dots, (S_4, R_4)$. As discussed
in Section III-B, in inter-session network coding, a receiver
does not always receive linear combinations of source packets
from all the sources. Assume that $R_1$ and $R_2$ only receive
linear combinations of source packets sent by $S_1$ and $S_2$; $R_3$
only receives combinations of source packets sent by $S_2$ and $S_3$;
$R_4$ receives linear combinations of source packets from all the
sources; and that the maximum number of malicious sources is
2. Then, the sum of the keys depicted at each receiver in Fig.
3 is sufficient for each node to carry out the verification.

The reason why $(k_1 + k_2 + k_3)$ is sufficient for $R_1$ to verify
a packet $y$ is twofold: (i) If $y$ is a benign packet, $k_4 \cdot y = 0$ as
$k_4 \in \Pi^{\perp}_{M_4}$; thus, $y \cdot (k_1 + \dots + k_4) = y \cdot (k_1 + \dots + k_3)$. As a
result, $R_1$ does not need to know $k_4$ to verify a valid packet.
(ii) If $y$ is corrupted, since there is at least one key secret to
the adversary ($M = 2$), we can use the same line of arguments
as in the proof of Theorem 3 to show that the probability of
forging a valid MAC tag for $y$ is only $\frac{1}{q}$.



Showing that the other sums are sufficient for $R_2$, $R_3$, and
$R_4$ can be done with similar arguments. Having different sums
for verification at different receivers decreases the damage done
by an adversary who compromises some of the receivers.
We discuss this in detail at the end of Section V-D.

C. Efficient Commitment 

The role of the committed source packets in InterMac is to
enable the controller to generate vectors (MAC keys) that are
orthogonal to the committed spaces (the $\Pi_{M_p}$'s). Here, we design
a more efficient commitment scheme that does not require
each source to send all of its source packets to the controller,
but still allows the controller to generate these orthogonal
vectors. To this end, we leverage two key techniques: padding
for orthogonality and private inner product computation.

The padding for orthogonality technique was originally introduced
in [20] to make a random vector orthogonal to all source
packets of multiple generations by padding an additional
element to each source packet. We apply this technique to make
a random vector chosen by the controller, which will serve as
a MAC key, orthogonal to the required subspace ($\Pi_{M_p}$). In
addition, we use the private inner product protocol proposed in
[31] to allow the controller to compute the padding elements
while keeping the randomly chosen vector private.

Private Inner Product Protocol. Let $\mathcal{E}$ = (Gen, Enc, Dec) be
a semantically secure homomorphic public-key cryptosystem.
In general, the private inner product protocol (PIP) proposed
in [31] works with various public-key cryptosystems that have
the following homomorphic properties:

• $\mathrm{Dec}(\mathrm{Enc}(m_1) \cdot \mathrm{Enc}(m_2)) = m_1 + m_2$, and

• $\mathrm{Dec}(\mathrm{Enc}(m_1)^{m_2}) = m_1 m_2$.

Popular cryptosystems that possess the above properties include
the Goldwasser-Micali [33], Paillier [34], and Benaloh [35]
cryptosystems. However, not all of them are suited for our
task. Specifically, in the Paillier system, the plaintext must be in
$\mathbb{Z}_q$, where $q$ is a product of two large primes, making $\mathbb{Z}_q$
not a finite field; this system thus does not fit our setting.
In the Goldwasser-Micali system, the plaintext domain is $\mathbb{F}_2$ and
could be extended to $\mathbb{F}_{2^t}$ [40]; however, the expansion factor,
i.e., the ratio between the size of the ciphertext and the plaintext,
is very high (up to hundreds), making it unsuitable for our
purpose. The Benaloh system is an extension of the Goldwasser-Micali
system: it reduces the expansion factor significantly; moreover,
its plaintext domain can be a finite field $\mathbb{Z}_q$, where $q$ is prime.
Therefore, we choose this system in our instantiation of the PIP
protocol.

Let $q$ be prime, so that $\mathbb{F}_q$ is isomorphic to $\mathbb{Z}_q$. Let $r =
(r_1, \dots, r_n)$ be a random vector chosen by the controller $C$,
and $v = (v_1, \dots, v_n)$ be a source vector of source $S$. $C$ and
$S$ carry out the PIP protocol described in Table I. With PIP,
$C$ can learn the inner product $r \cdot v$ while $S$ does not learn
any information about $r$, thanks to the security guarantee of
the encryption. Indeed, Goethals et al. [31] showed that this
protocol is secure in the semi-honest model, where it is assumed
that both parties follow the protocol, but they are curious and
try to deduce information from all exchanged data.

TABLE I
Private Inner Product (PIP) Protocol

Private Inputs: Private vectors $r, v \in \mathbb{F}_q^n$.
Private Outputs: Inner product $r \cdot v \in \mathbb{F}_q$.

1. Setup phase. The controller $C$ does:
   Generate a private and public key pair (sk, pk).
   Send pk to $S$.

2. The controller $C$ does for $i \in [1,n]$:
   Send $c_i = \mathrm{Enc}_{pk}(r_i)$ to $S$.

3. The source $S$ does:
   Send $w = \prod_{i=1}^{n} c_i^{v_i}$ to $C$.

4. The controller $C$ does:
   Compute $r \cdot v = \mathrm{Dec}_{sk}(w)$.
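
The four steps of Table I can be traced with a toy Benaloh-style cryptosystem. This is only a sketch under stated assumptions: the parameters `Q`, `P1`, `P2` are tiny and insecure, decryption uses a brute-force discrete logarithm, and the helper names are hypothetical rather than taken from the paper or any library.

```python
import math
import random

Q = 7             # toy field size (prime); also the Benaloh block size
P1, P2 = 29, 11   # toy primes: Q | (P1 - 1), gcd(Q, (P1 - 1) // Q) = 1, gcd(Q, P2 - 1) = 1

def keygen():
    assert (P1 - 1) % Q == 0 and math.gcd(Q, (P1 - 1) // Q) == 1
    assert math.gcd(Q, P2 - 1) == 1
    n, phi = P1 * P2, (P1 - 1) * (P2 - 1)
    # Pick y in Z_n^* whose (phi / Q)-th power is not 1, so plaintexts are recoverable.
    y = next(c for c in range(2, n)
             if math.gcd(c, n) == 1 and pow(c, phi // Q, n) != 1)
    return (n, y), (phi, pow(y, phi // Q, n))

def enc(pk, m, rng):
    n, y = pk
    while True:
        u = rng.randrange(1, n)
        if math.gcd(u, n) == 1:
            break
    return pow(y, m, n) * pow(u, Q, n) % n   # Enc(m) = y^m * u^Q mod n

def dec(pk, sk, c):
    n, _ = pk
    phi, x = sk
    a = pow(c, phi // Q, n)                  # equals x^m because u^phi = 1 mod n
    return next(m for m in range(Q) if pow(x, m, n) == a)

def pip(r_vec, v_vec, rng):
    pk, sk = keygen()                            # step 1: controller
    cts = [enc(pk, r_i, rng) for r_i in r_vec]   # step 2: controller sends Enc(r_i)
    w = 1
    for c_i, v_i in zip(cts, v_vec):             # step 3: source, which never sees r
        w = w * pow(c_i, v_i, pk[0]) % pk[0]
    return dec(pk, sk, w)                        # step 4: controller decrypts r . v

r_vec, v_vec = [3, 5, 1, 6], [2, 4, 6, 1]
inner = pip(r_vec, v_vec, random.Random(0))
```

The product of ciphertexts raised to the source symbols decrypts to the inner product modulo `Q`, which is exactly the pair of homomorphic properties listed before the table.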

Commitment, Padding, and Key Generation (CPK) Protocol.
Let $k \in \mathcal{K}$ and $F$ be a PRF: $\mathcal{K} \times (\mathcal{I} \times [1,s] \times [1, n+s-1+m]) \to \mathbb{F}_q$.
Each source packet will be padded with $s - 1$
elements. Using PIP, the controller $C$ generates the MAC keys
and computes the padding as follows:

1. Setup: Let id be the subspace identifier. For $i \in [1,s]$ and
$j \in [1, n+s-1+m]$, $C$ computes $r_i^{(j)} \leftarrow F(k, \mathrm{id}, i, j)$.
Let $r_i = (r_i^{(1)}, \dots, r_i^{(n+s-1+m)})$ and $\bar{r}_i = (r_i^{(1)}, \dots, r_i^{(n)})$.

2. Commitment: For each $i \in [1,s]$, $C$ and $S_{i'}$, $i' \in [1,s] \setminus \{i\}$,
carry out the PIP protocol so that $C$ learns $\bar{r}_i \cdot \bar{v}_{i',j}$, $\forall j \in [1,g]$.
The encryptions of these dot products sent from each
source to the controller in the PIP protocol represent the
commitment made by the sources.

3. Padding: Let $p_{i,j}^{(1)}, \dots, p_{i,j}^{(s-1)}$ denote the padding elements
for a source packet $v_{i,j}$ sent by source $S_i$. The padded
source packet, denoted by $p_{i,j}$, has the following form:

$$p_{i,j} = \big(\bar{v}_{i,j} \mid p_{i,j}^{(1)}, \dots, p_{i,j}^{(s-1)} \mid 0, \dots, 0, 1, 0, \dots, 0\big) \in \mathbb{F}_q^{\,n+s-1+m} ,$$

i.e., the $s - 1$ padding elements are placed between the $n$ data
symbols $\bar{v}_{i,j}$ and the $m$ coding coefficients of $v_{i,j}$.
The padding elements are computed by solving the following
system of $s - 1$ linear equations:

$$\{\, r_{i'} \cdot p_{i,j} = 0 \,\}_{i' \in [1,s] \setminus \{i\}} \tag{4}$$

For $s > 1$, this system has $s - 1$ unknowns and consists
of $s - 1$ linearly independent equations. Therefore, there
is a unique solution for $p_{i,j}^{(1)}, \dots, p_{i,j}^{(s-1)}$. $C$ then sends the
padding elements to $S_i$. $S_i$ now sends $p_{i,j}$ instead of $v_{i,j}$.

4. MAC keys: $C$ uses $r_i$ as MAC key $k_i$. Equations in (4)
ensure that the chosen key $k_i$ is orthogonal to $\Pi_{M_i}$.
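
The padding step amounts to solving a small linear system over $\mathbb{F}_q$. The sketch below does this for one packet with $s = 3$ sources. The field size, the key values, and the helper names are made up for illustration, and the inner products that the controller would obtain through PIP are computed in the clear here for brevity.

```python
Q = 11  # toy prime field (assumption)

def solve_mod(A, b, q=Q):
    """Solve the square system A x = b over F_q by Gauss-Jordan elimination."""
    size = len(A)
    M = [[x % q for x in row] + [bi % q] for row, bi in zip(A, b)]
    for col in range(size):
        piv = next((r for r in range(col, size) if M[r][col]), None)
        if piv is None:
            raise ValueError("padding equations are not linearly independent")
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], -1, q)
        M[col] = [x * inv % q for x in M[col]]
        for r in range(size):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(x - f * y) % q for x, y in zip(M[r], M[col])]
    return [row[size] for row in M]

def padding_for(keys, i, data, coeff_pos, q=Q):
    """Padding elements that make source i's padded packet orthogonal to every other key.

    coeff_pos is the 0-based position of the packet's unit coding coefficient.
    """
    n, s = len(data), len(keys)
    others = [k for idx, k in enumerate(keys) if idx != i]
    A = [k[n:n + s - 1] for k in others]                 # key entries that multiply the padding
    b = [-(sum(x * y for x, y in zip(k[:n], data))       # inner product (via PIP in the protocol)
           + k[n + s - 1 + coeff_pos]) for k in others]  # contribution of the unit coefficient
    return solve_mod(A, b, q)

# s = 3 sources, n = 2 data symbols, g = 1, m = 3, so keys have n + s - 1 + m = 7 entries.
KEYS = [
    [5, 1, 4, 2, 9, 3, 6],
    [1, 2, 3, 4, 5, 6, 7],
    [2, 7, 1, 8, 2, 8, 1],
]
data = [3, 5]                       # data symbols of source S_1's only packet
pads = padding_for(KEYS, 0, data, 0)
padded = data + pads + [1, 0, 0]    # data | padding | coding coefficients
```

In this instance the padded packet of $S_1$ is orthogonal to the keys of $S_2$ and $S_3$ but, in general, not to its own key, which is the property that equations (4) require.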
When using the CPK protocol, the sources no longer need
to send all of their source packets to the controller. Instead,
they only need to send an encryption of the inner product
for every source packet, thereby significantly reducing the
communication cost. Fig. 4 illustrates the CPK protocol for the
network shown in Fig. 1. We use InterMac_CPK to denote the
InterMac construction when using the CPK protocol to generate
MAC keys instead of Generate.

Fig. 4. Keys generation using the CPK protocol for the network of Fig. 1. $k_1$
is orthogonal to the padded vector $p_{2,1}$ and $k_2$ is orthogonal to $p_{1,1}$ thanks
to the padding. At the same time, $k_1$ is secret to $S_2$ and $k_2$ is secret to $S_1$.

Security. The security of InterMac_CPK in the semi-honest
model comes from the security of PIP and InterMac.

Let Generate_CPK denote a new generation algorithm that
takes as input a key $k \in \mathcal{K}$ and an identifier id, and generates
MAC keys using the CPK protocol. Let InterMac_CPK denote
the new InterMac construction with Generate_CPK. Consider
Attack Game 1, previously described in Section V-B, with the
following modified query step:

• Queries: The adversary $A$ chooses a subspace $\Pi$ and
its identifier id, then sends (id) to the challenger $C$. $A$
can make a polynomial number of queries. To respond
to a query $\mathrm{id}_l$, $C$ initiates the CPK protocol with $A$ to
compute the MAC key set $K_l$. Let $p_{i,j}^{(1)}, \dots, p_{i,j}^{(s-1)}$ denote
the padding elements of the source packet $v_{i,j}$ sent by
source $S_i$. For each padded source vector $p_{i,j}$, $C$ can also
compute its MAC tag under key $k_i = r_i$:

$$t_{i,j} \leftarrow r_i \cdot p_{i,j} = \bar{r}_i \cdot \bar{v}_{i,j} + p_{i,j}^{(1)} r_i^{(n+1)} + \dots + p_{i,j}^{(s-1)} r_i^{(n+s-1)} + r_i^{(n+s-1+g(i-1)+j)} .$$

Finally, $C$ sends all the tags and all the MAC keys in $K_l$
but one to $A$.

The setup step, output step, and the winning conditions remain
the same. The definition of security for multi-source homomorphic
MAC is now with respect to the above modified attack
game. Let $\mathrm{Enc\text{-}Adv}[B_2, \mathcal{E}]$ be the advantage that $B_2$ has over
a random guess in outputting the correct bit of the public-key
encryption security experiment $\mathrm{PubK}^{\mathrm{eav}}$. We refer the reader to
[32] for the experiment.



Theorem 4. For any fixed $q$, $n$, $s$, $g$, InterMac_CPK is a
secure $(q, n, s, g)$ multi-source homomorphic MAC in the
semi-honest model, assuming $F$ is a secure PRF and $\mathcal{E}$ is a
semantically secure public-key encryption. In particular, for
every multi-source homomorphic MAC adversary $A$, there is
a PRF adversary $B_1$ and a public-key encryption adversary $B_2$
who have similar running time to $A$, such that

$$\mathrm{Adv}[A, \mathrm{InterMac}_{\mathrm{CPK}}] \le \mathrm{PRF\text{-}Adv}[B_1, F] + \mathrm{Enc\text{-}Adv}[B_2, \mathcal{E}] + \frac{1}{q} .$$

Proof: The proof uses a sequence of games denoted
as Game 0, 1, and 2. Let $W_0$, $W_1$, and $W_2$ denote the events
that $A$ wins the multi-source homomorphic MAC security game in
Game 0, 1, and 2, respectively. Let Game 0 be identical to the
modified attack game. Hence,

$$\Pr[W_0] = \mathrm{Adv}[A, \mathrm{InterMac}_{\mathrm{CPK}}] . \tag{5}$$

In Game 1, the PRF $F$ is replaced by a truly random function,
i.e., in the CPK setup, the challenger computes $r_i^{(j)} \xleftarrow{R} \mathbb{F}_q$
instead of $r_i^{(j)} \leftarrow F(k, \mathrm{id}, i, j)$. Everything else remains the
same. Then, there exists a PRF adversary $B_1$ such that

$$|\Pr[W_0] - \Pr[W_1]| = \mathrm{PRF\text{-}Adv}[B_1, F] . \tag{6}$$

In Game 2, the encryption $\mathcal{E}$ is replaced with a perfect
encryption scheme, i.e., the encryption is information-theoretically
secure. There exists an encryption adversary $B_2$
such that

$$|\Pr[W_1] - \Pr[W_2]| = \mathrm{Enc\text{-}Adv}[B_2, \mathcal{E}] . \tag{7}$$

Note that in Game 2, (i) in the semi-honest model, the adversary
follows the CPK protocol; (ii) the encryptions sent from
the challenger give no information about the randomly chosen
vectors, the $r_i$'s, to the adversary; and (iii) the $r_i$'s are indistinguishable
from vectors chosen uniformly at random from $\mathbb{F}_q^{n+s-1+m}$.
Following the same line of argument as in the proof of Theorem
3 gives

$$\Pr[W_2] = \frac{1}{q} . \tag{8}$$

Equations (5), (6), (7), and (8) together prove the theorem. ■

In a stronger threat model, where malicious sources may
not follow the protocol, the security guarantee of InterMac_CPK
could still be achieved by adding appropriate controller
responses to malicious behaviors. Malicious behaviors of the
sources are limited to (i) not sending a well-formed encryption
back for each query of $C$, and (ii) not padding the source
packets appropriately. For (i), the controller could exclude any
source with this behavior from the source list and only calculate
MAC keys for the remaining sources. For (ii), improperly
padded packets will be dropped with high probability as they
are highly likely to be outside of the committed source space.

D. Private Inner Product MAC 

InterMac explores the first direction of Observation 3, which 
suggests different sources should use different keys. In this 
section, we explore the other direction, which suggests that all 
tags of the source packets be generated by the trusted controller 
instead of the sources, and the MAC key be secret to the 
sources. In particular, we show how the PIP protocol could 
be combined with SpaceMac previously proposed for intra- 
session network coding [25] to provide an alternative MAC- 
based scheme for detecting corrupted packets. 

SpaceMac consists of a triplet of algorithms: Mac, Combine,
and Verify. The construction of SpaceMac uses a PRF $F$:
$\mathcal{K} \times (\mathcal{I} \times [1, n+m]) \to \mathbb{F}_q$ and is as follows:

• Mac($k$, id, $y$): The tag $t \in \mathbb{F}_q$ of an input vector $y \in \mathbb{F}_q^{n+m}$
is computed by the following steps:

- $r \leftarrow (F(k, \mathrm{id}, 1), \dots, F(k, \mathrm{id}, n+m))$.

- $t \leftarrow y \cdot r \in \mathbb{F}_q$.

• Combine($(y_1, t_1, \alpha_1), \dots, (y_\ell, t_\ell, \alpha_\ell)$): The tag $t \in \mathbb{F}_q$ of
$y = \sum_{i=1}^{\ell} \alpha_i y_i \in \mathbb{F}_q^{n+m}$ is computed as follows:

- $t \leftarrow \sum_{i=1}^{\ell} \alpha_i t_i \in \mathbb{F}_q$.

• Verify($k$, id, $y$, $t$): To verify if $t$ is a valid tag of $y$ using
key $k$, we do the following:

- $r \leftarrow (F(k, \mathrm{id}, 1), \dots, F(k, \mathrm{id}, n+m))$.

- $t' \leftarrow y \cdot r$.

- If $t' = t$, output 1 (accept); otherwise, output 0 (reject).

Fig. 5. Tags generation using the PM protocol for the network of Fig. 1. The
tags are generated by the controller and the key is secret to the source.
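
The three SpaceMac algorithms can be sketched in a few lines. The choices below are assumptions made for illustration, not part of the scheme's specification: HMAC-SHA256 reduced modulo a prime stands in for the PRF $F$ (the reduction is slightly biased), and the field is the Mersenne prime $2^{61} - 1$.

```python
import hashlib
import hmac

Q = 2**61 - 1  # a Mersenne prime, used as a toy field size (assumption)

def prf(key: bytes, space_id: bytes, i: int, q: int = Q) -> int:
    """Stand-in for F(k, id, i): HMAC-SHA256 output reduced mod q."""
    digest = hmac.new(key, space_id + i.to_bytes(4, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % q

def mac(key, space_id, y, q=Q):
    r = [prf(key, space_id, i, q) for i in range(1, len(y) + 1)]
    return sum(a * b for a, b in zip(y, r)) % q

def combine(triples, q=Q):
    """triples: (packet, tag, coefficient); returns the tag of the linear combination."""
    return sum(alpha * t for _, t, alpha in triples) % q

def verify(key, space_id, y, t, q=Q):
    return mac(key, space_id, y, q) == t

key, space_id = b"demo key", b"generation-1"
y1 = [10, 20, 30, 1, 0]     # n = 3 data symbols followed by m = 2 coding coefficients
y2 = [7, 8, 9, 0, 1]
alphas = (5, 11)
triples = [(y1, mac(key, space_id, y1), alphas[0]),
           (y2, mac(key, space_id, y2), alphas[1])]
y = [(alphas[0] * a + alphas[1] * b) % Q for a, b in zip(y1, y2)]
t = combine(triples)
```

An intermediate node needs only `combine`, which involves no key, while `verify` recomputes the key vector from the PRF and checks a single inner product.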

Private MAC (PM) Protocol. The controller and the sources
carry out the PM protocol to compute tags of the source packets.
The PM protocol consists of the following steps:

1. Setup: Let id be the current subspace identifier. $C$ computes
$r^{(i)} \leftarrow F(k, \mathrm{id}, i)$, $\forall i \in [1, n+m]$. Let $\bar{r} = (r^{(1)}, \dots, r^{(n)})$
and $r = (r^{(1)}, \dots, r^{(n+m)})$.

2. Commitment: For each $i \in [1,s]$, $C$ and $S_i$ carry out the
PIP protocol that allows $C$ to learn $\bar{r} \cdot \bar{v}_{i,j}$, $\forall j \in [1,g]$. The
encryptions of the inner products sent by the sources to the
controller are the commitment.

3. MAC tags: For $v_{i,j}$, $C$ computes its tag $t_{i,j} = r \cdot v_{i,j}$.

Note that PM helps the controller compute the tags on behalf
of the sources without leaking the MAC key. Fig. 5 illustrates
how the SpaceMac MAC tags are computed for the network
shown in Fig. 1. We use SpaceMac_PM to denote the SpaceMac
scheme when used with the PM protocol to generate tags for
the source packets as opposed to the Mac algorithm.

Security. The security of SpaceMac_PM in the semi-honest
model comes from the security of PIP and SpaceMac. Below,
we analyze the security of SpaceMac when used with the PM
protocol.

Attack Game 2. We consider the following attack game for a
homomorphic MAC $T$ = (Mac, Combine, Verify), a challenger
$C$, and an adversary $A$:

• Setup: $C$ generates a random key $k \xleftarrow{R} \mathcal{K}$.

• Queries: The adversary $A$ chooses a subspace $\Pi$ and its
identifier id, then sends (id) to the challenger $C$. $A$ can
make a polynomial number of queries. To respond to a
query $\mathrm{id}_l$, $C$ initiates the PM protocol to compute tags of
all source packets. $C$ then sends all the tags to $A$.

• Output: The adversary $A$ outputs a triplet $(\mathrm{id}^*, y^*, t^*)$. We
consider that the adversary wins the security game if

(i) $\mathrm{id}^* = \mathrm{id}_l$ for some $l$,

(ii) $y^* \notin \Pi_l$, and

(iii) $\mathrm{Verify}(k, \mathrm{id}_l, y^*, t^*) = 1$.

Let $\mathrm{Adv}[A, T]$ denote the probability that $A$ wins the above
attack game. We define a secure homomorphic MAC scheme
as follows:

Definition 4. A $(q, n, m)$ homomorphic MAC scheme $T$ is
secure if for all probabilistic polynomial-time adversaries $A$,
$\mathrm{Adv}[A, T]$ is negligible.

Let SpaceMac PM denote the SpaceMac scheme when used 
with the PM protocol to generate tags for the source packets as 
opposed to the Mac algorithm. The security of SpaceMac PM is 
given by the following theorem: 

Theorem 5. For any fixed $q$, $n$, $m$, SpaceMac_PM is a secure
$(q, n, m)$ homomorphic MAC in the semi-honest model, assuming
$F$ is a secure PRF and $\mathcal{E}$ is a semantically secure public-key
encryption. In particular, for every homomorphic MAC
adversary $A$ there is a PRF adversary $B_1$ and a public-key
encryption adversary $B_2$ who have similar running time to $A$
such that

$$\mathrm{Adv}[A, \mathrm{SpaceMac}_{\mathrm{PM}}] \le \mathrm{PRF\text{-}Adv}[B_1, F] + \mathrm{Enc\text{-}Adv}[B_2, \mathcal{E}] + \frac{1}{q} .$$

Proof: The proof uses a sequence of games denoted
as Game 0, 1, and 2. Let $W_0$, $W_1$, and $W_2$ denote the events
that $A$ wins the homomorphic MAC security game in Game 0, 1, and
2, respectively. Let Game 0 be identical to Attack Game 2.
Hence,

$$\Pr[W_0] = \mathrm{Adv}[A, \mathrm{SpaceMac}_{\mathrm{PM}}] . \tag{9}$$

In Game 1, the PRF $F$ is replaced by a truly random function,
i.e., in the PM setup, the challenger computes $r^{(i)} \xleftarrow{R} \mathbb{F}_q$ instead
of $r^{(i)} \leftarrow F(k, \mathrm{id}, i)$. Everything else remains the same. Then,
there exists a PRF adversary $B_1$ such that

$$|\Pr[W_0] - \Pr[W_1]| = \mathrm{PRF\text{-}Adv}[B_1, F] . \tag{10}$$

In Game 2, the encryption $\mathcal{E}$ is replaced with a perfect
encryption scheme, i.e., the encryption is information-theoretically
secure. There exists an encryption adversary $B_2$
such that

$$|\Pr[W_1] - \Pr[W_2]| = \mathrm{Enc\text{-}Adv}[B_2, \mathcal{E}] . \tag{11}$$

Note that in Game 2, (i) in the semi-honest model, the
adversary follows the PM protocol; (ii) the encryptions sent
from the challenger give no information about the randomly
chosen vector, $r$, to the adversary; and (iii) $r$ is indistinguishable
from a vector chosen uniformly at random from $\mathbb{F}_q^{n+m}$. Let
$v_1, \dots, v_m$ be the source packets that span $\Pi_l$ (recall that
$\mathrm{id}^* = \mathrm{id}_l$ for some $l$). Consider the following system of $m + 1$
equations:

$$\begin{aligned}
r \cdot v_1 &= t_1 \\
&\ \ \vdots \\
r \cdot v_m &= t_m \\
r \cdot y^* &= t^*
\end{aligned}$$

The adversary learns the first $m$ equations from its query, and it
wins the security game if the last equation is valid and $y^* \notin \Pi_l$.
This system of equations is consistent regardless of the value
of $t^*$ because the coefficient matrix has rank $m + 1$, which
equals the number of equations. Furthermore, for any value $t^*$,
the solution space always has the same size $q^{n-1}$. Thus, for
a fixed $y^*$, its valid tag could be any value in $\mathbb{F}_q$ equally
likely, given that $r$ is chosen uniformly at random from $\mathbb{F}_q^{n+m}$.
As a result, the probability that the adversary chooses a correct
$t^*$ for any $y^*$ is $\frac{1}{q}$, i.e.,

$$\Pr[W_2] = \frac{1}{q} . \tag{12}$$

Equations (9), (10), (11), and (12) together prove the theorem. ■
We note that the security of SpaceMac_PM can also be
extended to the malicious model, where there are sources that
may not follow the PM protocol. In this model, a malicious
source $S_i$ is limited to not sending back an encryption (of
the inner product of $r$ and the appropriate $v_{i,j}$) or sending
back a malformed encryption. In response to these behaviors,
the controller can ignore $v_{i,j}$ in its tag computation and thus
does not send the tag of $v_{i,j}$ back to $S_i$. The source $S_i$, without
knowing the key $k$, will not be able to generate a valid tag for
$v_{i,j}$ (unless $v_{i,j}$ is a linear combination of vectors with already
known tags).



Comparison. Compared to InterMac_CPK, SpaceMac_PM is simpler
in terms of initialization. This is because InterMac_CPK
operates on $s$ MAC keys instead of one key. InterMac_CPK
and SpaceMac_PM have similarly efficient Combine and Verify
operations as both of them only involve simple field addition
and multiplication as opposed to exponentiation. When using
SpaceMac_PM, all receivers must know the MAC key $k$ in order
to verify their received packets. As a result, as soon as an
adversary compromises a receiver and learns $k$, it can fool all
other receivers into accepting corrupted packets. We stress that
this is not necessarily the case when using InterMac_CPK. For
instance, consider Fig. 3. Assume that $S_1$ and $S_2$ are malicious,
thus keys $k_1$ and $k_2$ are leaked. If the adversary compromises
$R_1$, it learns $k_3$ by subtracting $(k_1 + k_2)$ from the sum
$(k_1 + k_2 + k_3)$. However, it still cannot fool $R_2$, $R_3$, or $R_4$
into accepting a corrupted packet as the verification at these
receivers involves $k_4$, which is still secret to the adversary.
In fact, compromising any node but $R_4$ does not help the
adversary to break the verification of any additional receiver,
and compromising $R_4$ only allows the adversary to break the
verification of one additional receiver, $R_3$, but not all.

InterMac_CPK and SpaceMac_PM, as described, could be used
as a drop-in replacement for traditional MACs, e.g., HMAC,
for networks that use inter-session network coding: they allow
the receivers to detect corrupted packets. As when using a
traditional MAC scheme, we assume the key distribution is
through secure (authentic and private) channels. We also assume
the communication between the sources and the controller
in the CPK and PM protocols is through authentic channels.

E. In-Network Detection

Both of our MAC schemes could be extended to provide 
in-network detection by adopting state-of-the-art techniques 
proposed for intra-session network coding. We discuss two 
main options below: 



Delayed Key Disclosure (TESLA) [36]: This approach leverages
the time dimension to achieve broadcast authentication and has
been adapted to intra-session network coding to provide in-network
detection [11], [18], [24]. In this approach, nodes are
required to loosely synchronize their time. Both InterMac_CPK
and SpaceMac_PM could be used with the approaches proposed
in [18] and [24] to provide in-network detection for fixed
directed acyclic networks and dynamic peer-to-peer networks,
respectively. We note that the detection schemes based on [18],
[24] are fully collusion resistant and tag-pollution resistant (tag
pollution is an attack on MAC-based schemes that use multiple tags [18]).



Cover-Free Set Systems /.Ul: This approach leverages cover- 
free set systems to probabilistically distribute keys to all nodes 
such that any collusion of $c$ or fewer nodes does not leak all the
keys used in the whole system. This approach has been adapted
to intra-session network coding to provide in-network detection
[20], [27]. Both InterMac_CPK and SpaceMac_PM are suitable
to be used with this approach. Detection schemes based on
[20], [27] are $c$-collusion resistant. To address tag pollution, we
propose using our homomorphic hash-based detection scheme
to protect the coding coefficients and the tags of the packets.
This technique is motivated by the hybrid scheme MacSig
proposed by Zhang et al. [20], where a homomorphic signature
scheme is used to protect the coding coefficients and the tags.

VI. Performance Evaluation 

A. Bandwidth Overhead 

We compute the bandwidth overhead directly from the num- 
ber of packets, hashes, and MAC tags described in our schemes. 

1) Hash-Based Detection: Our hash-based scheme does not
incur any online bandwidth overhead per packet as there
is no additional symbol attached to each packet. The off-line
bandwidth overhead of this scheme is dominated by the
bandwidth required to distribute both the homomorphic and
traditional hashes from the controller to all the nodes. The size
of a homomorphic hash is $|q|$. Let $|h|$ denote the size of the
traditional hash (for SHA-1, $|h| = 160$ bits). The total off-line
bandwidth overhead is sg|£?|(|/i| + 

2) MAC-Based Detection: The off-line bandwidth overhead
of InterMac_CPK and SpaceMac_PM comes from the packets
exchanged during the execution of the CPK and PM protocols.
The off-line bandwidth overhead of InterMac_CPK includes the
overhead of the encryptions of the randomly chosen vectors sent
by the controller, the encryptions of the inner products sent
back by the sources, and the padding sent by the controller,
which is $s(s-1)(ne|q| + ge|q|) + sg(s-1)|q|$, where $e$
is the expansion factor of the encryption scheme and equals
$\frac{N}{|q|}$ ($N$ is the size of the modulus of the encryption in bits).
The off-line bandwidth overhead of SpaceMac_PM includes the
overhead of the encryption of the randomly chosen vector and
the encryptions of the inner products, which is $s(ne|q| + ge|q|)$.
To be concrete, for $N = 256$, $n = 1024$, $s = 5$, and
$g = 100$, the off-line bandwidth overhead per source packet
of InterMac_CPK and SpaceMac_PM ranges from 36% to 1% as
the field size increases from 32 to 256 bits. Fig. 6 shows
the percentage of bandwidth saved when using PIP for the
commitment as opposed to the sources sending all source
packets to the controller. As shown in Fig. 6, the percentage
of bandwidth saved increases as the field size increases. When
$|q| \ge 128$, the percentage of saving is larger than 90% for both
InterMac_CPK and SpaceMac_PM. The saving could be as much
as 99% for SpaceMac_PM when $|q| = N = 256$.

Fig. 6. Percentage of bandwidth saved with PIP as a function of field size.
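
The off-line figures quoted above can be reproduced from the two formulas. The sketch below is one reading of "overhead per source packet", namely the total off-line traffic divided by the $sg$ source packets and by the packet size $n|q|$; that normalization and the function names are assumptions, chosen because they match the 36%, 1%, 90%, and 99% values in the text.

```python
def offline_overhead_bits(scheme, n, s, g, field_bits, modulus_bits):
    """Total off-line traffic in bits, with expansion factor e = N / |q|."""
    e = modulus_bits / field_bits
    # Encryptions of the n key symbols plus g encrypted inner products, per exchange.
    enc = n * e * field_bits + g * e * field_bits
    if scheme == "InterMacCPK":
        return s * (s - 1) * enc + s * g * (s - 1) * field_bits
    if scheme == "SpaceMacPM":
        return s * enc
    raise ValueError(f"unknown scheme: {scheme}")

def overhead_per_source_packet(scheme, n, s, g, field_bits, modulus_bits):
    """Off-line overhead relative to one source packet of n symbols (assumed normalization)."""
    total = offline_overhead_bits(scheme, n, s, g, field_bits, modulus_bits)
    return total / (s * g) / (n * field_bits)

def bandwidth_saved(scheme, n, s, g, field_bits, modulus_bits):
    """Fraction saved compared with sending all source packets to the controller."""
    return 1 - overhead_per_source_packet(scheme, n, s, g, field_bits, modulus_bits)

PARAMS = dict(n=1024, s=5, g=100, modulus_bits=256)
worst = overhead_per_source_packet("InterMacCPK", field_bits=32, **PARAMS)   # about 0.355
best = overhead_per_source_packet("SpaceMacPM", field_bits=256, **PARAMS)    # about 0.011
```

The overhead shrinks as the field grows because the ciphertext size $N$ stays fixed while each source packet gets larger.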

The online overhead comes from the tags that accompany
each packet. To provide end-to-end detection, using a single
tag suffices. In this case, the overhead of both InterMac_CPK
and SpaceMac_PM is $\frac{1}{n}$ ($\approx 0.1\%$ for $n = 1024$). To provide
in-network detection for a directed acyclic network, let one of
our MAC schemes be used with the delayed key disclosure
technique in RIPPLE [18]. Let $S_0$ be a virtual node which
has an edge pointing toward every source node. Define the level
of a node as the length of the longest path from $S_0$ to the node.
Let $L$ be the maximum among the levels of the nodes. Each
packet carries $L$ MAC tags initially; then one or more tags are
peeled off at every node the packet goes through. The average
online overhead per packet is MrK %. 

In comparison, on average, the online overhead per packet of 
1 27 1 is ;|fep^ %> where \a\ is the size of a regular public key 
signature. We stress that this overhead depends on the number
of source packets whereas ours does not. To be concrete, if we
set $L = 16$ (as in [18]), $|\sigma| = 320$ (DSA), $|q| = 128$, $s = 5$,
$g = 100$, then the overhead per packet of [27] is 32 times
larger than ours (^xj^p^- — ^2). Fig. 7 plots the average 
online overhead per packet of (27) , a state-of-the-art intra- 
session detection scheme (20) , and our MAC-based scheme as 
a function of packet length. The range of the packet length 



When using InterMaccpK, the delayed MAC keys must be verified differ- 
ently, i.e., using public key verification instead of one-way key chain. 
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Fig. 7. Per-packet bandwidth overhead of the homomorphic signature scheme 
(27) , the hybrid scheme in (15) (e = 1,8 = 0.1, e = 0.01), and our 
lnterMac C pK/SpaceMac PM when used with RIPPLE fl8). 




500 600 700 800 900 1,000 
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Fig. 8. Per-packet per-node computation overhead of the signature scheme 
(27| , the hybrid scheme in |20| (c = 1,5 = 0.1, e = 0.01), our hash-based 
scheme, and lnterMaccpK/SpaceMac PM when used with RIPPLE [18]. 



is chosen according to [20] for ease of comparison. This plot 
shows that not only is our overhead significantly smaller than 
that of |(27), but it is also small, as small as 3%. Our overhead 
is comparable to that of |20|. 

B. Computation Overhead 

We focus on the online overhead incurred by the operations
performed at each node per packet and neglect other overhead,
e.g., computing the hashes and MAC keys, as it is negligible
when amortized over the number of packets in the network. Similar to
[20], we calculate the computation overhead by approximating
various operations by the number of finite field multiplications.
To calculate the computation time, for ease of comparison, we
adopt the benchmark obtained in [20] on a 2.0 GHz Intel Core
2 CPU, where approximately $2.5 \times 10^5$ multiplications can be
performed per second for $|q| = 128$.

1) Hash-Based Detection: For each packet, the worst-case
scenario is that the node needs to perform a homomorphic
hash check, i.e., run the Test algorithm of H-DL.
This algorithm entails $n + m$ modular exponentiations (recall
$m = sg$). In comparison, in the worst case, the scheme in
[27] requires $n + m$ exponentiations plus $s$ public-key signature
verifications. In the best case, where the received packet is
decodable, our scheme requires only a traditional hash check.

2) MAC-Based Detection: Let one of our MAC schemes 
be used with RIPPLE as described in Section VI-A. For each
packet, the overhead includes one Combine (to generate the 
tag of the packet) and one Verify (to verify the integrity of the 
packet). Let w be the average number of packets combined by 
each node. Then, on average, the Combine algorithm entails 
w(^^) multiplications; meanwhile, the Verify algorithm en- 
tails n + m + multiplications. The total average overhead 
is w{^^) + (n + m + multiplications. 

In comparison, the average overhead of [27] is $n + \frac{m}{2}$
exponentiations plus $\frac{s}{2}$ public-key verifications. For simplicity,
we approximate the cost of one public-key verification (DSA)
by two modular exponentiations. Using the "square-and-multiply"
method for exponentiation over a finite field $\mathbb{F}_q$, each
exponentiation takes approximately $\frac{3}{2}|q|$
multiplications on average [20]. The total average overhead is
$\frac{3}{2}|q|(n + \frac{m}{2} + s)$ field multiplications.
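
To make the per-exponentiation cost concrete, the following sketch counts modular multiplications in a plain left-to-right square-and-multiply routine. It is illustrative only: the helper names `pow_count` and `average_mults`, the base 3, and the Mersenne-prime modulus standing in for $q$ are assumptions of this example, not part of the schemes compared here. For a random exponent of a given bit length, the count averages about 1.5 multiplications per bit (one squaring per bit plus one extra multiplication for roughly half of the bits).

```python
import random


def pow_count(base, exponent, modulus):
    """Left-to-right square-and-multiply.

    Returns (base**exponent % modulus, number of modular multiplications),
    counting each squaring and each multiply-by-base as one multiplication.
    """
    result = 1
    mults = 0
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus  # one squaring per exponent bit
        mults += 1
        if bit == "1":
            result = (result * base) % modulus  # extra multiply for a 1 bit
            mults += 1
    return result, mults


def average_mults(bits, trials=2000, seed=0):
    """Average multiplication count over random exponents of exactly `bits` bits."""
    rng = random.Random(seed)
    modulus = (1 << 127) - 1  # a Mersenne prime, standing in for q
    total = 0
    for _ in range(trials):
        exponent = rng.getrandbits(bits) | (1 << (bits - 1))  # force the top bit
        total += pow_count(3, exponent, modulus)[1]
    return total / trials


print(average_mults(128))  # close to 1.5 * 128 = 192
```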

For concreteness, let $L = 16$, $w = 4$, $n = 1024$, $s = 5$, $g = 100$,
and $|q| = 128$. We approximate a traditional hash check
by 80 field multiplications (1 per iteration of SHA-1) and let the
decodable probability be 50%. Fig. 8 plots the average online
computation overhead per packet per node of the signature-based
scheme in [27], the intra-session detection scheme in
[20], and our hash-based and MAC-based schemes. This plot
shows that the overhead of our hash-based scheme is half
that of [27]. The computation efficiency would increase with
the decodable probability. The plot also demonstrates that the
overhead of our MAC-based scheme is small, ranging from 4
to 6 ms, and is two orders of magnitude less than that of
[27] and [20].
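
As a rough sanity check on these magnitudes, the sketch below converts multiplication counts into time using the benchmark of $2.5 \times 10^5$ multiplications per second. It is an approximation under explicit assumptions: each exponentiation is charged 1.5 multiplications per exponent bit, only the dominant $n + m$ term of Verify is kept for the MAC-based scheme, and the packet lengths 500 and 1024 are hypothetical sample points. The function names are introduced here for illustration.

```python
MULTS_PER_SECOND = 2.5e5  # benchmark quoted from [20] for |q| = 128
Q_BITS = 128
S, G = 5, 100
M = S * G                 # m = sg source packets
HASH_CHECK_MULTS = 80     # traditional hash check, as approximated in the text
# Assumption: square-and-multiply averages about 1.5 multiplications per exponent bit.
MULTS_PER_EXP = 1.5 * Q_BITS


def to_ms(mults):
    """Convert a multiplication count to milliseconds on the benchmark CPU."""
    return 1000.0 * mults / MULTS_PER_SECOND


def hash_based_mults(n, p_decodable=0.5):
    """Expected cost: cheap hash check if decodable, else n + m exponentiations."""
    worst_case = (n + M) * MULTS_PER_EXP
    return p_decodable * HASH_CHECK_MULTS + (1.0 - p_decodable) * worst_case


def mac_dominant_mults(n):
    """Only the dominant n + m term of Verify; smaller terms are ignored here."""
    return n + M


for n in (500, 1024):  # hypothetical packet lengths, in field symbols
    print(n, to_ms(hash_based_mults(n)), to_ms(mac_dominant_mults(n)))
```

Under these assumptions the dominant MAC term alone comes to about 4 ms at 500 symbols and about 6.1 ms at 1024 symbols, in line with the 4 to 6 ms range quoted above, while the hash-based estimate is in the hundreds of milliseconds.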

VII. Conclusion 

In this work, we introduce three efficient schemes to detect
pollution attacks in inter-session network coding. The central
idea of our schemes is the use of the commitment of the source packets.
Our first scheme is a novel combination of homomorphic and
traditional hash functions. The other two schemes are novel
MAC schemes for inter-session network coding: InterMacCPK
and SpaceMacPM. To the best of our knowledge, InterMacCPK is
the first multi-source homomorphic MAC scheme that supports
multiple keys. Except when using one-hop decoding, e.g.,
COPE, we recommend using detection schemes built on our
MAC schemes, as they have significantly lower computation
overhead. Finally, we recommend using InterMacCPK over
SpaceMacPM when there may be malicious receivers.

References 

[1] S. Katti, H. Rahul, W. Hu, D. Katabi, M. Medard, and J. Crowcroft, "XORs in the Air: Practical Wireless Network Coding," in SIGCOMM'06, 2006.
[2] S. Omiwade, R. Zheng, and C. Hua, "Butterflies in the Mesh: Lightweight Localized Wireless Network Coding," in NetCod'08, 2008.
[3] Y. Feng, Z. Liu, and B. Li, "GestureFlow: Streaming Gestures to an Audience," in IEEE INFOCOM'11, 2011.
[4] N. Cai and R. W. Yeung, "Secure Network Coding," in ISIT'02, 2002.
[5] Z. Zhang, "Network Error Correction Coding in Packetized Networks," in Info Theory Workshop, 2006.
[6] S. Jaggi, M. Langberg, S. Katti, T. Ho, D. Katabi, and M. Medard, "Resilient Network Coding in the Presence of Byzantine Adversaries," in INFOCOM'07.
[7] R. Koetter and F. R. Kschischang, "Coding for Errors and Erasures in Random Network Coding," in ISIT'07, 2007.
[8] T. Ho, B. Leong, R. Koetter, M. Medard, M. Effros, and D. R. Karger, "Byzantine Modification Detection in Multicast Networks using Randomized Network Coding," in ISIT'04, 2004.
[9] E. Kehdi and B. Li, "Null Keys: Limiting Malicious Attacks Via Null Space Properties of Network Coding," in INFOCOM'09, 2009.
[10] Z. Yu, Y. Wei, B. Ramkumar, and Y. Guan, "An Efficient Scheme for Securing XOR Network Coding against Pollution Attacks," in INFOCOM'09, 2009.
[11] J. Dong, R. Curtmola, and C. Nita-Rotaru, "Practical Defenses Against Pollution Attacks in Intra-Flow Network Coding for Wireless Mesh Networks," in WiSec'09.
[12] C. Gkantsidis and P. R. Rodriguez, "Cooperative Security for Network Coding File Distribution," in INFOCOM'06, 2006.
[13] Q. Li, D.-M. Chiu, and J. C. Lui, "On the practical and security issues of batch content distribution via network coding," in ICNP'06, 2006.
[14] F. Zhao, T. Kalker, M. Medard, and K. J. Han, "Signatures for Content Distribution with Network Coding," in ISIT'07, 2007.
[15] D. Charles, K. Jain, and K. Lauter, "Signatures for network coding," in Info Sciences and Systems, vol. 1, no. 1, 2006.
[16] D. Boneh, D. Freeman, J. Katz, and B. Waters, "Signing a Linear Subspace: Signature Schemes for Network Coding," in PKC'09, 2009.
[17] S. Agrawal and D. Boneh, "Homomorphic MACs: MAC-Based Integrity for Network Coding," in ACNS'09, 2009.
[18] Y. Li, H. Yao, M. Chen, S. Jaggi, and A. Rosen, "RIPPLE Authentication for Network Coding," in INFOCOM'10, 2010.
[19] Y. Jiang, H. Zhu, M. Shi, X. S. Shen, and C. Lin, "An efficient dynamic-identity based signature scheme for secure network coding," Computer Networks, vol. 54, no. 1, pp. 28-40, Jan. 2010.
[20] P. Zhang, Y. Jiang, C. Lin, H. Yao, A. Wasef, and X. S. Shen, "Padding for Orthogonality: Efficient Subspace Authentication for Network Coding," in INFOCOM'11.
[21] M. Jafarisiavoshani, C. Fragouli, and S. Diggavi, "On Locating Byzantine Attackers," in NetCod'08, 2008.
[22] Q. Wang, L. Vu, K. Nahrstedt, and H. Khurana, "Identifying Malicious Nodes in Network-Coding-Based Peer-to-Peer Streaming Networks," in Mini INFOCOM'10.
[23] A. Le and A. Markopoulou, "Cooperative Defense Against Pollution Attacks in Network Coding Using SpaceMac," in Technical Report. [Online]. Available: http://arxiv.org/abs/1102.3504
[24] A. Le and A. Markopoulou, "TESLA-Based Defense Against Pollution Attacks in P2P Systems with Network Coding," in NetCod'11, 2011.
[25] A. Le and A. Markopoulou, "Locating Byzantine Attackers in Intra-Session Network Coding using SpaceMac," in NetCod'10, 2010.
[26] A. Le and A. Markopoulou, "On Detecting Pollution Attacks in Inter-Session Network Coding," in Technical Report. [Online]. Available: TBA
[27] S. Agrawal, D. Boneh, X. Boyen, and D. Freeman, "Preventing Pollution Attacks in Multi-Source Network Coding," in PKC'10, 2010.
[28] W. Yan, M. Yang, L. Li, and H. Fang, "Short Signatures for Multi-source Network Coding," in MINES'09, 2009.
[29] J. Dong, R. Curtmola, C. Nita-Rotaru, and D. Yau, "Pollution Attacks and Defenses in Wireless Inter-flow Network Coding Systems," in WiNC'10, 2010.
[30] M. N. Krohn, M. J. Freedman, and D. Mazieres, "On-the-Fly Verification of Rateless Erasure Codes for Efficient Content Distribution," in SP'04, 2004.
[31] B. Goethals, S. Laur, H. Lipmaa, and T. Mielikainen, "On Private Scalar Product Computation for Privacy-Preserving Data Mining," in ICISC'04, 2004.
[32] J. Katz and Y. Lindell, Introduction to Modern Cryptography. Chapman & Hall/CRC Press, 2007.
[33] S. Goldwasser and S. Micali, "Probabilistic Encryption," Journal of Computer and System Sciences, vol. 28, pp. 270-299, 1984.
[34] P. Paillier, "Public-Key Cryptosystems Based on Composite Degree Residuosity Classes," in EUROCRYPT'99, 1999.
[35] J. Benaloh, "Dense probabilistic encryption," in Workshop on Selected Areas of Cryptography, vol. 28, no. 2, pp. 120-128, Apr. 1994.
[36] A. Perrig, R. Canetti, J. D. Tygar, and D. Song, "The TESLA Broadcast Authentication Protocol," RSA CryptoBytes, vol. 5, 2002.
[37] R. Canetti, J. Garay, G. Itkis, D. Micciancio, M. Naor, and B. Pinkas, "Multicast security: a taxonomy and some efficient constructions," in INFOCOM'99.
[38] P. L. Montgomery, "Modular Multiplication Without Trial Division," Mathematics of Computation, vol. 44, no. 170, p. 519, Apr. 1985.
[39] M. Bellare, O. Goldreich, and S. Goldwasser, "Incremental Cryptography: The Case of Hashing and Signing," in Advances in Cryptology, vol. 839, 1994, pp. 216-233.
[40] M. Franklin and P. Mohassel, "Efficient and Secure Evaluation of Multivariate Polynomials and Applications," in ACNS'10, Beijing, China, 2010, pp. 236-254.



